Abstract:

This paper reports on research comparing various approaches, or
methodologies, for software development. The study focuses on the
quantitative analysis of the application of certain methodologies
in an experimental environment, in order to further understand
their effects and better demonstrate their advantages in a
controlled environment. A series of statistical experiments was
conducted comparing programming teams that used a disciplined
methodology (consisting of top-down design, process design
language usage, structured programming, code reading, and chief
programmer team organization) with programming teams and
individual programmers that employed ad hoc approaches. Specific
details of the experimental setting, the investigative approach
(used to plan, execute, and analyze the experiments), and some of
the results of the experiments are discussed.

In the development of any theory, there are three phases of
validation. First is the logical development of the theory based
on a set of sound principles. Second is the application of the
theory and the gathering of evidence that the theory is applicable
in practice. This usually involves some qualitative assessment in
the form of case studies. The third step is the quantitative
analysis of the application of theory in an experimental
environment in order to further understand its effect and better
demonstrate its advantages in a controlled environment.

There has been a great deal written about methodologies and
programming environments for developing software [Wirth 71; Dahl,
Dijkstra, and Hoare 72; Jackson 75; Brooks 75; Myers 75; Linger,
Mills, and Witt 79]. It is clear that many of these methodologies
are based on sound logical principles and their adoption within a
production environment has been successful. There have been many
case studies that attempt to validate these theories; projects
have adapted versions of these methods and have reported varying
degrees of success, i.e., the users feel they got the job done
faster, made fewer errors, or produced a better product [Baker 75;
Basili and Turner 75; Daley 77]. Unfortunately, there has been a
minimum of real quantitative evidence that comparatively assesses
any particular methodology [Shneiderman et al. 77; Myers 78;
Sheppard et al. 78]. This is partially because of the cost and
impracticality of a valid experimental setup within a production
("real-world") environment.

This  leaves  open  the  question  of  whether  there  is  a 
measurable  benefit  derived  from  various  programming  methodologies 
and  environments  with  respect  to  either  the  developed  product  or 
the  development  process.  Even  if  the  benefits  are  real,  it  is  not 
clear  that  they  can  be  quantified  and  effectively  monitored. 
Software development is still too much of an art in the aesthetic
or spontaneous sense. In order to fully understand it, control
it, and adapt it to particular applications and situations,
software development must become more of a science in the
engineering and calculated sense. What is required is more
empirical study, data collection, and experimental analysis.

The purpose of the research reported in this paper is (a) to
quantitatively investigate the effect of methodologies and
programming environments on software development and (b) to
develop an investigative methodology based on scientific
experimentation and tailored to this particular application. It
involves the measurement and analysis of both the process and the
product in a manner which is minimally obtrusive (to those
developing the software), very objective, and highly automatable.
The  goal  of  the  research  was  to  verify  the  effectiveness  of  a 
particular  programming  methodology  and  to  identify  various 
quantifiable  aspects  that  could  demonstrate  such  effectiveness. 


To this end, a controlled experiment was conducted involving
several replications of a specific software development task under
varying programming environments. For each replication, successive
versions of the software under development were entered in a
historical data base which recorded details of the development
process and product. A host of measurements were extracted from
the data base and statistically analyzed in order to achieve the
research goals. Some of these measurements were "confirmatory,"
as they were planned in advance and expected to show differences
among the programming environments being investigated, while many
of the measurements were simply "exploratory."

The study involves three distinct groupings of software
developers: individual programmers, ad hoc three-person
programming teams, and three-person programming teams using a
disciplined methodology. The individual programmers and the ad
hoc teams were allowed to develop the software in a manner of
their own choosing; this is referred to as an ad hoc approach.
The disciplined methodology referred to in this paper consists of
an integrated set of software development techniques and team
organizations which include top-down design, use of a process
design language, structured programming, code reading, and chief
programmer teams.

The study examines differences in the expectancy and the
predictability of software development behavior under the
programming environments represented by these groups.

The basic premise is that distinctions among the groups exist
both in the process and in the product. With respect to the
software development product, it is believed that the disciplined
team should approximate the single individual with regard to
product characteristics (such as number of decisions coded and
global data accessibility) and at the very least lie somewhere
between the single individual and the ad hoc team. This is
because the disciplined methodology should help in making the team
act as a mentally cohesive unit during the design, coding, and
testing phases. With respect to the software development process,
the disciplined team should have advantages over both individuals
and ad hoc teams, displaying superior performance on cost factors
such as computer usage and number of errors made. This is because
of the discipline itself and because of the ability to use team
members as a resource for validation.

The  study's  findings  reveal  several  programming 
characteristics  for  which  statistically  significant  differences  do 
exist  among  the  groups.  The  disciplined  teams  used  fewer  computer 
runs  and  apparently  made  fewer  errors  during  software  development 
than  either  the  individual  programmers  or  the  ad  hoc  teams.  The 
individual  programmers  and  the  disciplined  teams  both  produced 
software with essentially the same number of decision statements,
but software produced by the ad hoc teams contained greater
numbers of decision statements. For no characteristic was it
concluded that the disciplined methodology impaired the
effectiveness of a programming team or diminished the quality of
the software product.

The investigation has been conducted in a laboratory or
proving-ground fashion, in order to achieve some statistical
significance and scientific respectability without sacrificing
production realism and professional applicability. By scaling
down a typical production environment while retaining its
important characteristics, the laboratory setting provides for a
reasonable compromise between the extremes of
(a) "toy" experiments,

which can afford elaborate experimental designs and large
sample sizes but often suffer from a basic task that is
rather unrelated to production situations or involve a
context from which it is difficult to extrapolate or scale up
(e.g., introductory computer course students taking
multiple-choice quizzes based on thirty-line programs),
and (b) "production environment" experiments,

which offer a high degree of production realism by definition
but incur prohibitively high costs even for the simplest and
weakest experimental designs (i.e., statistical
experimentation requires replication, and multiple
duplication of a nontrivial programming project is clearly
expensive).

The experiment in this study was conducted within an academic
environment where it was possible to achieve an adequate
experimental design and still simulate key elements of a
production environment.

An  initial  phase  of  investigation  has  been  completed  and  the 
complete  results  are  presented  in  the  remainder  of  this  paper. 
Section  II  gives  details  pertaining  to  the  experiment  itself. 
Section III describes the investigative methodology used to plan,
execute, and analyze the experiment. Sections IV and V present
the experiment's results, segregated into empirical findings
(resulting from statistical analysis of the measurements) and
intuitive judgements (resulting from interpretation of some of the
empirical findings), respectively. Section VI contains some
remarks on this initial phase of investigative effort and a
discussion of further work planned for the study.

It should be noted that the terms "methodology" and
"methodological" (in reference to software development) are
consistently used throughout this report with a technical meaning
related to the concept of a comprehensive integrated set of
development techniques as well as team organizations, rather than
to the more common notion of a particular technique or
organization in isolation.

This section describes several aspects of the experiment
itself; namely, the experimental design or setup, the experimental
environment, data collection and reduction (during and subsequent
to the experiment), and the programming aspects and associated
metrics (used to quantify the experiment).

Design/Setup

The  major  facets  of  the  experimental  design  are  the 
experimental  units,  the  experimental  treatment  factors,  the 
experimental  treatment  factor  levels,  the  experimental  variables 
observed,  the  experimental  local  control,  and  the  experimental 
management of other factors. (See [Ostle and Mensing 75; Chapter
9] for a thorough presentation of these facets.) An experimental
unit is that unit to which a single treatment (which may be a
combination of several factor levels) is applied in one
replication of the basic experiment. In this case, the basic
experiment was the accomplishment of a given software development
project, and the experimental unit was the software development
team, i.e., a small group of people who worked together to develop
the  software.  There  was  a total  of  19  such  units  involved  in  the 
experiment. 

In most experiments, attention is focused on one or more
independent variables and on the behavior of one or more dependent
variables as the independent variables are permitted to vary.
These independent variables are known as experimental treatment
factors. This experiment focused on two particular facets of
software development, (1) size of the development team and (2)
degree of methodological discipline, as the experimental treatment
factors.

Most experiments involve some deliberate differential
variation in the experimental treatment factors. The various
values or classifications of the factors are known as the levels
of the experimental treatment factors. In this experiment, two
levels were selected for each factor. For the size factor, the
levels were single individuals and three-person teams. For the
degree-of-discipline factor, the levels were an ad hoc approach
and a disciplined methodology.

The  experimental  (dependent)  variables  observed  consisted  of 
13C  programming  aspects  relating  to  the  software  product  and 
development process. Technically, this created a whole series of
simultaneous univariate experiments, all having the same common
experimental design and all based on the same data sample, with
one experiment for each programming aspect. The immediate goal of
an experiment is to learn something about the relationship between
the  experimental  treatment  factor  levels  and  the  observed 
variables. 
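
The idea of one univariate experiment per programming aspect can be pictured with a small sketch. This is only an illustration under stated assumptions: the function name `summarize_by_aspect`, the choice of the mean as the summary statistic, and all scores below are hypothetical placeholders, not the study's procedure or data.

```python
from statistics import mean


def summarize_by_aspect(scores):
    """Treat each programming aspect as its own univariate experiment.

    scores maps aspect name -> {group label -> list of per-unit scores}.
    Returns aspect name -> {group label -> mean score}.
    """
    return {
        aspect: {group: mean(values) for group, values in by_group.items()}
        for aspect, by_group in scores.items()
    }


# Invented placeholder scores (NOT the study's data), three units per group.
example_scores = {
    "COMPUTER JOB STEPS": {"AI": [10, 12, 14], "AT": [11, 13, 15], "DT": [6, 8, 10]},
    "DECISIONS": {"AI": [20, 22, 24], "AT": [30, 32, 34], "DT": [21, 23, 25]},
}

summary = summarize_by_aspect(example_scores)
for aspect, group_means in summary.items():
    print(aspect, group_means)
```

Every aspect shares the same groups and the same units; only the observed variable changes from one pass of the loop to the next.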

Experimental local control refers to the configuration by
which (a) experimental units are obtained, (b) certain sets of
units are placed into groups, and (c) these different groups are
subjected to certain combinations of experimental factor levels.
Local control is employed in the design of an experiment in order
to increase the statistical efficiency of the experiment (or
sensitivity/power of the statistical test). Experimental local
control usually incorporates some form(s) of randomization, a
basic principle of experimental design, since it is necessary for
the validity of statistical test procedures.

For this experiment, subjects were obtained simply on the
basis of course enrollment. Since the experiment was completely
embedded within two academic courses, every student in those
courses automatically participated in the experiment. Development
"teams" were formed among the subjects: in one course, the
students were allowed to choose between segregating themselves as
individual programmers or combining with two other classmates as
three-person programming teams; in the other course, the students
were assigned (by the researchers) into three-person teams.
Experimental units were formed and placed into groups in this
manner because the two academic courses themselves provided the
two levels of the second experimental treatment factor. This
process yielded three groups of 6, 6, and 7 units, designated AI,
AT, and DT, respectively. Each group was exposed to a particular
combined factor-level treatment according to the following partial
factorial arrangement: (AI) single individuals using an ad hoc
approach, (AT) three-person teams using an ad hoc approach, and
(DT) three-person teams using specific state-of-the-art
methodologies.
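
The partial factorial arrangement can be laid out as a small sketch that enumerates the full 2 x 2 crossing of the two factors and marks which cells were actually populated. The group labels and unit counts come from the text above; the names `SIZE_LEVELS`, `DISCIPLINE_LEVELS`, `OBSERVED`, and `design_cells` are illustrative choices only.

```python
from itertools import product

SIZE_LEVELS = ("single individual", "three-person team")
DISCIPLINE_LEVELS = ("ad hoc approach", "disciplined methodology")

# Group label and number of experimental units for each populated cell.
OBSERVED = {
    ("single individual", "ad hoc approach"): ("AI", 6),
    ("three-person team", "ad hoc approach"): ("AT", 6),
    ("three-person team", "disciplined methodology"): ("DT", 7),
}


def design_cells():
    """List every cell of the full 2 x 2 factorial with its group, if any."""
    cells = []
    for size, discipline in product(SIZE_LEVELS, DISCIPLINE_LEVELS):
        label, units = OBSERVED.get((size, discipline), (None, 0))
        cells.append((size, discipline, label, units))
    return cells


cells = design_cells()
for cell in cells:
    print(cell)

# Total number of experimental units across the populated cells.
total_units = sum(units for *_, units in cells)
print("total units:", total_units)
```

The one cell with no group (single individuals using the disciplined methodology) is what makes the arrangement partial rather than a full factorial.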

The disciplined methodology imposed on teams in group DT
consisted of an integrated set of techniques, including top-down
design of the problem solution using a Process Design Language
(PDL), functional expansion, design and code reading,
walk-throughs, and chief programmer and manager teams. These
techniques and organizations were taught as an integral part of
the course that the subjects were taking. The course material was
organized around [Linger, Mills, and Witt 79], [Basili and Baker
75], and [Brooks 75] as textbooks. Since the subjects were
novices in the methodology, they executed the techniques and
organizations to varying degrees of thoroughness and were not
always as successful as seasoned users of the methodology would
be.


Specifically,  the  disciplined  methodology  prescribed  the  use 
of  a PDL  for  expressing  the  design  of  the  problem  solution.  The 
design  was  expressed  in  a top-down  manner,  each  level  representing 
a solution  to  the  problem  at  a particular  level  of  abstraction  and 
specifying  the  functions  to  be  expanded  at  the  next  level.  The 
PDL consisted of a specific set of structured control and data
structures, plus an open-ended designer-defined set of operators
and operands corresponding to the level of the solution and the
particular application. Design and code reading involved the
critical review of each team member's PDL or code by at least one
other member of the team. Walk-throughs represented a more
formalized presentation of an individual's work to the other
members of the team in which the PDL or code was explained step by
step. In the chief programmer teams, the chief programmer defined
the top-level solution to the problem in PDL, designed and
implemented the key code himself, and assigned subtasks to each of
the other two programmers, who code read for the chief programmer,
designed or coded subpieces as requested by him, and performed
librarian activities (i.e., entering or revising code stored
on-line, making test runs, etc.). The manager teams were defined
in a similar fashion, except that the manager also acted as
librarian, writing less code and doing more code reading, and
yielded much greater responsibility for design and implementation
to the other members of the team.

Each individual or team in groups AI and AT was allowed to
develop the software in a manner of their own choosing, which is
referred to in this paper as an ad hoc approach. No methodology
was taught in the course these subjects were taking. Informal
observation by the experimenters confirmed that the approaches
used by the individuals and ad hoc teams were indeed lacking in
discipline and did not utilize the key elements of the disciplined
methodology (e.g., an individual working alone cannot practice
code reading, and it was evident that the ad hoc teams did not use
a PDL or formally do a top-down design).

There are usually several extraneous factors, other than the
ones identified as experimental treatment factors, which could
influence the behavior being observed. The experimental design
employed three distinct methods to control various extraneous
factors. Factors were either fixed (artificially or externally
held constant across all experimental units), balanced
(artificially or externally distributed as evenly as possible
among the experimental units), or randomized (allowed to vary in a
naturally random way among the experimental units). In this
experiment, a variety of programming factors which do affect
software development were given conscious consideration as
extraneous variables and controlled as follows:

- personal ability/talent of people: randomized
  (and balanced within disciplined teams)
- project/task/application: fixed
- project specifications: fixed
- implementation language: fixed
- calendar schedule: fixed
- available computer resources: fixed
- available man-hour resources: randomized
- available automated tools: fixed

Wherever possible, these variables were held constant by
explicitly treating all experimental units in the same manner.

Two  variables,  the  personal  ability  of  the  participants  and  the 
amount  of  actual  time  they  (as  students  with  other  classes  and 
responsibilities)  had  to  devote  to  the  project,  could  only  be 
allowed  to  vary  among  the  groups  in  what  was  assumed  to  be  a 
random manner. However, information from a questionnaire was used
to balance the personal ability of the participants in the
disciplined teams (only) by first (a) partitioning the group DT
students into three equal-sized categories (high, medium, low)
based on their grades in previous computer courses and their
extracurricular programming experience, and then (b) assigning
them  to  teams  by  randomly  selecting  one  student  from  each  category 
to  form  each  team. 
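
The two-step balancing procedure just described can be sketched as follows. This is a minimal illustration, not the researchers' actual procedure: it assumes each student's grades and extracurricular experience have already been combined into one numeric ability score, and the function name `balanced_teams`, the student identifiers, and the scores are all hypothetical.

```python
import random


def balanced_teams(ability_scores, rng):
    """Form three-person teams with one member from each ability category.

    ability_scores maps student name -> numeric ability score (hypothetical).
    The number of students must be a multiple of three.
    """
    if len(ability_scores) % 3 != 0:
        raise ValueError("number of students must be a multiple of three")
    ranked = sorted(ability_scores, key=ability_scores.get, reverse=True)
    third = len(ranked) // 3
    # (a) Partition into equal-sized high, medium, and low categories.
    categories = [ranked[:third], ranked[third:2 * third], ranked[2 * third:]]
    # (b) Shuffle each category, then draw one student from each per team.
    for category in categories:
        rng.shuffle(category)
    return [tuple(members) for members in zip(*categories)]


# Invented scores for 21 hypothetical students, enough for seven teams.
raw_scores = [91, 85, 78, 74, 70, 66, 61, 57, 50, 44, 39,
              33, 28, 25, 22, 18, 15, 12, 9, 6, 3]
students = {f"s{i:02d}": score for i, score in enumerate(raw_scores)}

teams = balanced_teams(students, random.Random(0))
for team in teams:
    print(team)
```

Passing in a seeded `random.Random` keeps the assignment reproducible; each resulting team holds exactly one high, one medium, and one low student.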

Environment

Several  particulars  of  the  experimental  environment 
contribute significantly to the context in which the experiment's
results  must  be  appraised.  These  include  the  time  and  place  the 
experiment  was  conducted,  the  software  development  project  (or 
application)  which  served  as  the  task  performed  during  the 
experiment,  the  people  who  participated  as  subjects,  and  the 
computer  programming  language  in  which  the  software  was  written. 

The  experiment  was  conducted  during  the  Spring  1976  semester, 
January  through  May,  within  the  regular  academic  courses  given  by 
the  Department  of  Computer  Science  on  the  College  Park  campus  of 
the University of Maryland.

Several general characteristics of the project are
noteworthy. The application was a compiler, involving string
processing and language translation (via scanning, parsing, code
generation, and symbol table management). The scope of the
project excluded both extensive error handling and user
documentation. The project difficulty was slight but
nonnegligible, requiring roughly a two man-month effort. The size
of the resulting system averaged over 1200 lines of high-level
(structured language) source code. The total task was to design,
implement, test, and debug the complete computer software system
given a particular specification. All aspects of the project were
fixed and uniform for each of the development teams. Each team
worked independently to build its own system, using identical (1)
specifications, (2) computer resources allocated, (3) duration of
calendar time allotted, (4) implementation language, (5) testing
tools, etc.

The participants were advanced undergraduate students and
graduate students in the Department of Computer Science. None
were novice programmers: all had completed at least four semesters
of programming course work, several were about to graduate and
take programming jobs in government or industry, and a few even
had as much as three years' professional programming experience.
On the whole, the participants might best be described as
"advanced student programmers with a bit of professional
experience." The experiment was conducted within the framework of
two comparable advanced elective courses, each with the same
academic prerequisites. The project and the experimental
treatments were built into the course material and assignments,
and everyone in the two classes participated in the experiment.
They were aware of being monitored, but had no knowledge of what
was being observed or why. A reasonable degree of homogeneity
seemed to exist among the participants with respect to personnel
factors, such as ability, experience, motivation, time/effort
devoted to the project, etc. On the whole, they were typically
average in each of these factors with natural fluctuations which
appeared to be evenly distributed among the experimental groups in
a random fashion. Based upon pre-experiment qualitative judgment,
all subjects shared a similar background with respect to team
programming and the disciplined methodology. However, groups AI
and AT (the individuals and ad hoc teams) seemed to have had a
slight edge over group DT (the disciplined teams) with respect to
general programming ability and formal training in the application
area.


The implementation language was the high-level,
non-block-structured, structured-programming language SIMPL-T
[Basili and Turner 76]. This language was designed and developed
at the University of Maryland where it is taught and used
extensively in regular Department of Computer Science courses. It
is characterized by a very simple and efficient run-time
environment. SIMPL-T contains the following control constructs:
sequence, ifthenelse, whiledo, case, and exits from loops (but no
gotos). The language adheres to a philosophy of "strong data
typing" and all variables must be explicitly declared. It
provides the programmer with both automatic recursion and
string-processing capabilities similar to PL/1.

Data Collection and Reduction

Due to the partially exploratory nature of the experiment in
terms of differences to be discovered in the project and process,
as much information was collected as could be done in an
efficient, effective, and unobtrusive manner. A variety of
information sources was used. Individual questionnaires supplied
the personal background and programming experience of each
participant. Private team interviews and open-class team reports
yielded information regarding individual performance on the
project. Run logs and computer account billing reports gave a
record of the computer activity during the project. Special
module compilation and program execution processors (invoked
on-line via very slight changes to the regular command language)
created a historical data base of source code and test data
accumulated throughout the project development.

The data base provides the principal source of information
analyzed in the current investigation, and other information
sources have been utilized only in an auxiliary manner (if at
all). Thus, data collection for the experiments themselves was
automated on-line, with essentially no interference to the
programmer's normal pattern of actions during computer (terminal)
sessions. The final products were isolated from the data base and
measured for various syntactic and organizational aspects of the
finished product source code. Effort and cost data were also
extracted from the data base. The inputs to the analysis, in the
form of scores for the various programming aspects, reflect the
quantitatively measured character of the product and effort of the
process. Much of the data reduction was done automatically within
a specially instrumented compiler. Some was done manually (e.g.,
examining characteristics across modules). Due to the underlying
collection and reduction mechanism, which was uniformly applied
to all experimental units, the data used in the analysis has the
characteristics of objectivity, uniformity, and quantitativeness
and is measured on an interval scale of measurement [Conover 71;
pp. 65-67].

Programming Aspects and Metrics

The  dependent  variables  studied  in  this  experiment  are  called 
programming  aspects.  They  represent  specific  isolatable  and 
observable  features  of  the  programming  phenomenon  which  are  highly 
automatable (i.e., they could be extracted or computed directly
on-line from information readily obtainable from operating systems
and compilers). The variables fall into two categories: process
aspects and product aspects. Process aspects are related to
characteristics of the development process itself, in particular,
the cost and required effort as reflected in the number of
computer job steps (or runs) and the amount of textual revision of
source code during development. Product aspects are related to
the syntactic content and organization of the symbolic source code
which represents the complete final product that was developed.
Examples are number of lines, frequency of particular statement
types, average size of data variables' scope, etc. For each
aspect there exists an associated metric, a specific algorithm
which ultimately defines that aspect and by which it is measured.
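
As an illustration of what such a metric might look like, the sketch below computes the frequency of statement types, both as raw counts and as percentages. It is not the paper's instrumented compiler: the function name `statement_type_metrics` and the sample statement stream are hypothetical, and a real tool would obtain the statement labels from the compiler rather than from a hand-written list.

```python
from collections import Counter


def statement_type_metrics(statement_types):
    """Return (counts, percentages) for a sequence of statement-type labels."""
    counts = Counter(statement_types)
    total = sum(counts.values())
    if total == 0:
        return counts, {}
    percentages = {kind: 100.0 * n / total for kind, n in counts.items()}
    return counts, percentages


# Hypothetical statement stream standing in for compiler output.
sample = ["IF", "WHILE", "CALL", "IF", "RETURN", "CALL", "CALL", "CASE"]

counts, percentages = statement_type_metrics(sample)
for kind in sorted(counts):
    print(f"{kind}: {counts[kind]} ({percentages[kind]:.1f}%)")
```

Because the computation is a fixed algorithm over mechanically obtained input, the resulting score is the same no matter who runs it, which is the sense in which such aspects are objective and automatable.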

The particular programming aspects examined in this
investigation are listed in Table 1. They appear grouped by
category; indented qualifying phrases specify particular variants
of certain general aspects. When referring in this paper to an
individual (sub)aspect, a concatenation of the heading line with
the qualifying phrases (separated by \ symbols) is used; for
example, COMPUTER JOB STEPS\MODULE COMPILATIONS\UNIQUE denotes the
number of COMPUTER JOB STEPS that were MODULE COMPILATIONS in
which the source code was UNIQUE from all other compiled versions.
Explanatory notes (keyed to the list in Table 1) about the
programming aspects are given in Appendix 1, complete with
definitions for the nontrivial or unfamiliar metrics. Technical
meanings for various system- or language-dependent terms used in
the  paper  (e.g.,  module,  segment,  intrinsic,  entry)  also  appear 
there.  Some  of  these  words  mean  different  things  to  different 
people,  and  the  reader  is  cautioned  against  drawing  inferences  not 
based  on  this  paper's  definitions. 
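
The naming convention for (sub)aspects, and the kind of count it labels, can be sketched as follows. The helper names `aspect_name` and `count_compilations` are hypothetical, and the rule used here for UNIQUE versus IDENTICAL (a compilation is UNIQUE the first time its exact source text is compiled and IDENTICAL on any repeat) is an assumption made for illustration; the governing definitions are those in Appendix 1.

```python
SEPARATOR = "\\"  # a single backslash, as in the paper's notation


def aspect_name(*parts):
    """Join a heading line and its qualifying phrases into one aspect name."""
    return SEPARATOR.join(parts)


def count_compilations(compiled_sources):
    """Split module compilations into UNIQUE and IDENTICAL counts.

    Assumption (not the paper's definition): a compilation is UNIQUE the
    first time its exact source text appears, and IDENTICAL on any repeat.
    """
    seen = set()
    unique = 0
    for source in compiled_sources:
        if source not in seen:
            seen.add(source)
            unique += 1
    base = aspect_name("COMPUTER JOB STEPS", "MODULE COMPILATIONS")
    return {
        aspect_name(base, "UNIQUE"): unique,
        aspect_name(base, "IDENTICAL"): len(compiled_sources) - unique,
    }


# Hypothetical compilation history; each string stands for one source text.
history = ["v1", "v2", "v2", "v3", "v1"]

result = count_compilations(history)
for name, value in result.items():
    print(name, "=", value)
```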

The complete set of programming aspects may be partitioned
into two subsets based upon the motivation for their inclusion in
the study. Several aspects, hereafter denoted as
"confirmatory," had been consciously planned in advance of
collecting and extracting the data, because intuition suggested
that they would serve well as quantitative indicators of important
qualitative characteristics of the software development phenomenon.
It was predicted a priori that these "confirmatory" aspects would
verify the study's basic premises regarding the programming
methodologies being investigated in the experiment. The remaining
aspects, hereafter denoted as "exploratory," were considered
mainly  because  they  could  be  collected  and  extracted  cheaply  (even 
as  a natural  by-product  sometimes)  along  with  the  "confirmatory" 
aspects.  There  was  little  serious  expectation  that  these 
"exploratory"  aspects  would  be  useful  indicators  of  differences 
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Table  1.  Programming  Aspects 


N.E.  The  asterisks  to  the  left  mark  "confirmatory"  aspects; 
"exploratory"  aspects  are  unmarked.  The  parenthesized  numbers 
to  the  right  refer  to  the  explanatory  notes  in  Appendix  1. 


Development  process  aspects  : 

COMPUTES  JOS  STEPS 

MODULE  COMPILATIONS 
UNIQUE 
IDENTICAL 

PROGRAM  EXECUTIONS 
MISCELLANEOUS 

ESSENTIAL  JOB  STEPS 

AVEPAGE  UNIQUE  COMPILATIONS  PER  MODULE 
MAX  UNIQUE  COMPILATIONS  F.A.O.  MODULE 

PROGRAM  CHANGES 

final  product  aspects  : 

MODULES 

SEGMENTS 

SEGMENT  TYPE  COUNTS  : 

function 

PROCEDURE 

SEGMENT  TYPE  PERCENTAGES  : 

FUNCTION 

PROCEDURE 

AVEPAGE  SEGMENTS  PEP  MODULE 

lines]' 

statements 

statement  type  counts  : 

IF 

CASE 

while 

EXIT 

(PROC ) CALL 

NO  INTRINSIC 
INTRINSIC 
RETURN 

STATEMENT  TYPE  PERCENTAGES  : 

IF 

CASE 

WHILE 

EXIT 

(PROC)CALL 

NONINTRINSIC 

INTRINSIC 

RETURN 

AVEPAGE  STATEMENTS  PER  SEGMENT 


AVERAGE  STATEMENT  NESTING  LEVEL 

222X2222222222222222X2222222222=2: 

DECISIONS 

I 2222222222222222222222222222222  = 2: 

'FUNCTION  CALLS 
NONINTRINSIC 
INTRINSIC 

222222232233222222222222222222X22: 


(16) 

(17) 

(IS) 

(19) 

(20) 

(21) 

(22)  (44) 

(23)  (44) 

(23 )  (44 ) 

(24) 


(22X44) 
(23)  (44) 
(23)  (44) 
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TOKENS 

AVEPAGE  TOKENS  PER  STATEMENT 

32S23332333S333SSX222SS223S3333SS33SS33S 

INVOCATIONS 

FUNCTION 

NONINTRINSIC 

INTRINSIC 

PROCEDURE 

NONINTRINSIC 

INTRINSIC 

NONINTRINSIC 

INTRINSIC 


AVG  INVOCATIONS  PER  (CALLING)  SEGMENT 
FUNCTION 

NONINTRINSIC 

INTRINSIC 

OROCEDURE 

NONINTRINSIC 

INTPINSIC 

NONINTRINSIC 

INTRINSIC 


AVG  INVOCATIONS  PER  (CALLED)  SEGMENT 
FUNC  TION 
PROCEDURE 


DATA  VARIABLES 


DATA  VARIABLE  SCOPE  COUNTS  : 
GLOBAL 
ENTRY 

MODIFIED 
UNMODIFIED 
NONEN TRY 
MODIFIED 
UNMODIFIED 
MODIFIED 
UNMODIFIED 
NONGLOBAL 
PARAMETER 
VALUE 
REFERENCE 
LOCAL 


DATA  VARIABLE  SCOPE  PERCENTAGES  : 

global 

ENTRY 

MODIFIED 

UNMODIFIED 

NONENTRY 

MODIFIED 

UNMODIFIED 

MODIFIED 

UNMODIFIED 

NONGLOEAL 

PARAMETER 

VALUE 

PEFERENCE 

LOCAL 


AVERAGE  GLOBAL  VARIABLES  D ER  MODULE 
ENTRY 
NONENTRY 
MODIFIED 
UNMODIFIED 


AVERAGE  NONGLGBAL  VARIABLES  PER  SEGMENT 
PARAMETER 
LOCAL 


(23) 

(28) 

(29) 

(11 ) (44) 
(23) (44) 
(23) (44) 
(11K44) 
(23) (44) 
(23) (44) 
(22) 

(23) 

H?1 

(23) 

(23) 

(11 ) 

(23) 

(23) 

(23) (44) 
(23) 

(31 )  (44) 
(11) 

(11  ) 

(32) 

(37) 

(33) 

(34) 

(35) 

(35) 

(34) 

(35) 

(25) 

(35) 

(35) 

(33) 

(33) 

(26) 

(26) 

(23) 

(37) 

(33) 

(34) 

(25) 

(35) 

(34) 

(35) 

(35) 

(35) 

(35) 

(33) 

(33) 

(26) 

(36) 

(23) 

(26) 

(34) 

(34) 

(35) 

(25) 

(28) 

(33) 

(33) 
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PARAMETER  PASSAGE  TYPE  PERCENTAGES  : 
VALUE 

REFERENCE  • 

c seg^glOeal)  actual  usage  pairs” 

ENTRY 

MODIFIED 

UNMODIFIED 

NONENTRY 

MODIFIED 

UNMODIFIED 

MODIFIED 

UNMODIFIED 


(seg, global)  possible  usage  pairs 
entry 

modified 

unmodified 

nonentry 

modified 

unmodified 

MODIFIED 

UNMODIFIED 


(SEG, GLOBAL)  USAGE  RELATIVE  PERCENTAGES 
ENTRY 

MODIFIED 

UNMODIFIED 

NONENTRY 

MODIFIED 

UNMODIFIED 

MODIFIED 

UNMODIFIED 

(SEG, GLOBAL, SEG)  DATA  BINDINGS  : 

ACTUAL 

SUBFUNCTIONAL 

INDEPENDENT 

POSSIBLE 

RELATIVE  PERCENTAGE 

A***************************************** 


(39) 
(36) 
(36) 

(40) 

(34) 

(35) 
(35) 

(34) 

(35) 

Hi! 

(35) 

ass 

(35) 

(35) 

(34) 

(35) 
(35) 
(35) 
(35) 

(43) 

(34) 

(35) 
(35) 

(34) 

(35) 
(35) 
(35) 
(35) 

(41  ) 

(42) 

(43) 
(43) 
(42) 
(42) 
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among  the  groups;  but  they  were  included  in  the  study  with  the 
intent  of  observing  as  many  aspects  as  possible  on  the  off  chance 
of  discovering  any  unexpected  tendency  or  difference.  The 
"cent  irmatory"  programming  aspects  are  identified  by  being  flagged 
in  Table  1 with  an  asterisk;  the  "explorat,.*y"  programming  aspects 
are  unflagged. 

This distinction between "confirmatory" and "exploratory" has
important consequences for the evaluation of the study's
experiments. For the "confirmatory" aspects, the individual
experiments are actually confirmatory, since it was hypothesized
that they would indicate certain differences among the groups,
prior to conducting the experiment and extracting their scores.
But for the "exploratory" aspects, whose scores were extracted
without any preconceived hypotheses, the experiments are purely
exploratory. Thus, this study combines elements of both
confirmatory and exploratory data analysis within one common
experimental setting [Tukey 69]. This distinction does not,
however, influence the method by which the experiments themselves
were conducted.

It should be noted that a large percentage of the product
aspects fall into the "exploratory" category. A secondary
motivation for their consideration is that the product aspects, as
a unit, represent a fairly extensive taxonomy of the surface
features of software. The idea that important software qualities
(e.g., "complexity") could be measured by counting such surface
features has generally been disregarded by researchers as too
simplistic (e.g., [Mills 73; p. 232]). A resolve to study these
surface features empirically, to see if something might turn up,
before rejecting the underlying idea, was partially responsible
for their inclusion in the study.

In order to avoid any inadvertent deception or
misunderstanding, the following issue of redundancy must be stated
and properly appreciated. There exist several instances of
duplicate programming aspects; that is, certain logically unique
aspects appear a second time with another name, in order to
provide alternative views of the same metric and to achieve a
certain degree of completeness within a set of related aspects.
For example, the FUNCTION CALLS aspect and the STATEMENT TYPE
COUNTS\(PROC)CALL aspect are listed (and categorized appropriately)
from the viewpoint of the various types of constructs that comprise
the implementation language. But the very same metrics can be
considered from the unifying viewpoint of the various subtype
frequencies for segment invocations, and thus it is desirable to
include the duplicate aspects INVOCATIONS\FUNCTIONS and
INVOCATIONS\PROCEDURES as part of the natural categorization of
INVOCATIONS. Within the 137 programming aspects listed in Table
1, there are seven pairs of duplicate aspects (identified in the
notes of Appendix 1), leaving 130 nonredundant aspects examined in
the study. By definition, the data scores obtained for any pair
of duplicate aspects will be identical, and thus the same
statistical conclusions will be reached for both aspects. This
must be kept in mind when evaluating the results of the
experiments in terms of their statistical impact.

III. Approach


This section describes the steps in an investigative
methodology developed for the particular problem of comparing
software development efforts under various conditions. It was
used to guide the planning, execution, and analysis of the
experimental investigations whose results are reported in this
paper.


The investigative methodology can be characterized as an
empirical study based on the "construction" paradigm in which
multiple subjects are closely monitored during actual "production"
experiences, each subject performing the same task, with
controlled variation in specific variables. It uses scientific
experimentation and statistical analysis based on a
"differentiation among groups by aspects" paradigm in which
possible differences among the groups, as indicated by differences
in certain quantitatively measured aspects of the observed
phenomenon, are the target of the analysis. This use of
"difference discrimination" as the analytical technique dictates a
model of homogeneity hypothesis testing that influences nearly
every element of the methodology.

Note that there are other analysis techniques that could have
been used; e.g., estimation of magnitude of difference,
correlations between various aspects (across all combinations of
factor-levels), multivariate analysis (rather than multiple
univariate analyses in parallel), and factor analysis (breakdown
of variance) among the various aspects. These are useful
techniques and may be used in later phases of this research.
However, difference discrimination represents a "first-cut" probe,
which hopefully will yield some information to help guide more
refined probes in the future.

Although the methodology is built around an empirical study
and utilizes scientific experimentation, the actual execution of
the experiments and collection of data play a small role in the
overall methodology when compared to the planning and analysis
phases. This is readily apparent from the Approach Schematic,
Diagram 1, which charts some of the relationships among the
various elements (or steps) of the investigative methodology.

The  remainder  of  this  chapter  outlines  and  briefly  describes 
the  overall  approach  by  defining  each  step  in  general  and 
discussing  how  the  approach  was  applied  in  the  research  effort  at 
hand.  Note  that  Sections  IV  and  V give  the  specifics  of  the  last 
two  steps  of  the  approach  — statistical  conclusions  and  research 
interpretations — as  pertaining  to  the  current  experimental 
investigation. 

Step 1: Questions of Interest

Several questions of interest were initiated and refined so
that answers could be given in the form of statistical conclusions
and research interpretations. Questions were formulated on the
basis of several areas of interest: (1) software development
rather than software maintenance, (2) a particular set of
programming factors, (3) quantitatively measurable aspects of the
process and the product, (4) two particular levels for each of the
programming factors, (5) the particular type of analysis technique
mentioned above, and (6) intuitive considerations and suspicions
leading to choice of a particular three-way grouping of the
factor-level combinations.

The final questions of interest culminated in the form
"During software development, what comparisons between the effects
of the three factor-level combinations (a) single individuals, (b)
ad hoc teams, and (c) disciplined teams appear as differences in
the various quantitatively measurable aspects of the software
development process and product? Furthermore, what kind of
differences are exhibited and what is the direction of these
differences?"


Diagram 1. Approach Schematic

Step 2: Research Hypotheses

Since the investigative methodology involves hypothesis
testing, it is necessary to have fairly precise statements, called
research hypotheses, which are to be either supported or refuted
by the evidence. The second step in the approach was to formulate
these research hypotheses, disjoint pairs designated null and
alternative, from the questions of interest.

A precise meaning was given to the notion "what kind of
difference." The investigation considered both (a) differences in
central tendency or average value, and (b) differences in
variability around the central tendency, of observed values of the
quantifiable programming aspects. It should be noted that this
decision to examine both location and dispersion comparisons among
the experimental groups brought a pervasive duality to the entire
investigation (i.e., two sets of statistical tests, two sets of
statistical results, two sets of conclusions, etc., always in
parallel and independent of each other), since it addresses both
the expectancy and the predictability of behavior under the
experimental treatments.

Some vagueness was removed regarding the size of the
particular programming task by making explicit the implicit
restriction that completion of the task not be beyond the
capability of a single programmer working alone for a reasonable
period of time. Additionally, a large set of programming aspects
were specified; they are discussed in Section II, Specifics. For
each programming aspect there were similar questions of interest,
similar research hypotheses and similar experiments conducted in
parallel.

The schema for the research hypotheses may be stated as "In
the context of a one-person do-able software development project,
there < is not | is > a difference in the < location |
dispersion > of the measurements on programming aspect < X >
between individuals (AI), ad hoc teams (AT), and disciplined teams
(DT)." For each programming aspect 'X' in the set under
consideration, this schema generates two pairs of nondirectional
research hypotheses, depending upon the selection of 'is not' or
'is' corresponding to the null and alternative hypotheses, and the
selection of 'location' or 'dispersion' corresponding to the type
of difference.
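
As an illustration only, the expansion of this schema can be sketched
in a few lines of Python. The function name research_hypotheses and the
template constant SCHEMA are hypothetical names chosen for this sketch;
the wording of the template follows the schema quoted above.

```python
# Sketch: expand the research-hypothesis schema for one programming aspect.
# SCHEMA and research_hypotheses are illustrative names, not from the study.
SCHEMA = (
    "In the context of a one-person do-able software development project, "
    "there {verb} a difference in the {kind} of the measurements on "
    "programming aspect {aspect} between individuals (AI), ad hoc teams (AT), "
    "and disciplined teams (DT)."
)


def research_hypotheses(aspect):
    """Return {kind: (null, alternative)} for the given aspect."""
    pairs = {}
    for kind in ("location", "dispersion"):
        null = SCHEMA.format(verb="is not", kind=kind, aspect=aspect)
        alternative = SCHEMA.format(verb="is", kind=kind, aspect=aspect)
        pairs[kind] = (null, alternative)
    return pairs


pairs = research_hypotheses("PROGRAM CHANGES")
for kind, (null, alternative) in pairs.items():
    print(kind, "| null:", null)
    print(kind, "| alternative:", alternative)
```

Each aspect thus yields two (null, alternative) pairs, one for location
and one for dispersion.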

Step 3: Statistical Model

The choice of a statistical model makes explicit various
assumptions regarding the experimental design, such as the
dependent variables observed, the distributions of the underlying
populations, etc. Because the study involves a
homogeneity-of-populations problem with shift and spread
alternatives, the multi-sample model used here requires the
following criteria: independent populations, independent and
random sampling within each population, and interval scale of
measurement [Conover 71; pp. 65-67] for each programming aspect.
Although random sampling was not explicitly achieved in this study
by rigorous sampling procedures, it was nonetheless assumed on the
basis of the apparent representativeness of the subject pool and
the lack of obvious reasons to doubt otherwise. Due to the small
sample sizes, the unknown shape of the underlying distributions,
and the partially exploratory nature of the study, a nonparametric
statistical model was used.

Whenever statistics is employed to "prove" that some
systematic effect (in this case, a difference among the groups)
exists, it is important to measure the risk of error. This is
usually done by reporting a significance level α [Conover 71; p.
79], which represents the probability of deciding that a
systematic effect exists when in fact it does not. In the model,
the hypothesis testing for each programming aspect was regarded as
a separate independent experiment. As a consequence of this
choice, the significance level is controlled and reported
experimentwise (i.e., on a per aspect basis). While the
assumption of independence between such experiments is not
entirely supportable, this procedure is valid as long as
conclusions that couple one or more of these programming aspects
are avoided or properly qualified. In this study, statements
regarding interrelationships among aspects are made only within
the interpretations in Section V.

Step 4: Statistical Hypotheses

The research hypotheses must be translated into statistically
tractable form, called statistical hypotheses. A correspondence,
governed by the statistical model, exists between
application-oriented notions in the research hypotheses (e.g.,
typical performance of a programming team under the disciplined
methodology) and mathematical notions in the statistical
hypotheses (e.g., expected value of a random variable defined over
the population from which the disciplined teams are a
representative sample). Generally speaking, only certain
mathematical statements involving pairs of populations are
statistically tractable, in the sense that standard statistical
procedures are applicable. Statements that are not directly
tractable may be broken down into tractable (sub)components whose
results are properly recombined after having been decided
individually.

In this study, the research hypotheses are concerned with
directional differences among three programming environments.
Since the corresponding mathematical statements are not directly
tractable, they were broken down into the set of seven statistical
hypotheses pairs shown below. The hypotheses pair

    null: AI = AT = DT      alternative: ¬(AI = AT = DT)

addresses the existence of an overall difference among the groups.
However, due to the weak nondirectional alternative, it cannot
indicate which groups are different or in what direction a
difference lies. Standard statistical practice prescribes that a
successful test for overall difference among three or more groups
[...] The hypotheses pairs

    null: AI = AT      alternative: AI ≠ AT  or  AI < AT  or  AT < AI
    null: AT = DT      alternative: AT ≠ DT  or  AT < DT  or  DT < AT
    null: AI = DT      alternative: AI ≠ DT  or  AI < DT  or  DT < AI

address the existence and direction of pairwise differences
between groups. The results of these pairwise comparisons were
used to explicate the overall comparison. Data collected for a
set of experiments may often be legitimately reused to "simulate"
other closely related experiments, by combining certain samples
together and ignoring the original distinction(s) between them.

It is meaningful, in the context of this study's experimental
design, to compare any two groups pooled against the third since
(1) AI and AT are both undisciplined, while DT is disciplined; (2)
AT and DT are both teams, and AI is individuals; and (3) under the
assumption that disciplined teams behave like individuals (which
is part of the study's basic premise), DT and AI can be pooled
and compared with AT acting as a control group. The hypotheses
pairs

    null: AI+AT = DT   alternative: AI+AT ≠ DT  or  AI+AT < DT  or  DT < AI+AT
    null: AT+DT = AI   alternative: AT+DT ≠ AI  or  AT+DT < AI  or  AI < AT+DT
    null: AI+DT = AT   alternative: AI+DT ≠ AT  or  AI+DT < AT  or  AT < AI+DT

address the existence and direction of such pooled differences.
The results of these pooled comparisons were used to corroborate the
overall and pairwise comparisons.

Thus,  for  any  particular  programming  aspect,  the  research 
hypotheses  pair  corresponds  to  seven  different  pairs  (null  and 
alternative)  of  scientific  hypotheses.  The  results  of  testing 
each  set  of  seven  hypotheses  must  be  abstracted  and  organized  into 
one  statistical  conclusion  using  the  first  research  framework 
discussed  in  the  next  step. 
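
The seven hypotheses pairs (one overall, three pairwise, three pooled)
can be enumerated mechanically. The following is a minimal sketch under
the assumption that each pair is represented simply as a null string and
a list of alternative strings; the function name statistical_hypotheses
is hypothetical.

```python
from itertools import combinations

GROUPS = ("AI", "AT", "DT")


def statistical_hypotheses(groups=GROUPS):
    """List the seven (null, alternatives) pairs for a three-group comparison."""
    a, b, c = groups
    # Overall comparison: one nondirectional alternative.
    pairs = [(f"{a} = {b} = {c}", [f"not ({a} = {b} = {c})"])]
    # Pairwise comparisons: nondirectional plus both directional alternatives.
    for x, y in combinations(groups, 2):
        pairs.append((f"{x} = {y}", [f"{x} != {y}", f"{x} < {y}", f"{y} < {x}"]))
    # Pooled comparisons: two groups pooled against the remaining third.
    for x, y in combinations(groups, 2):
        (z,) = [g for g in groups if g not in (x, y)]
        pooled = f"{x}+{y}"
        pairs.append(
            (f"{pooled} = {z}",
             [f"{pooled} != {z}", f"{pooled} < {z}", f"{z} < {pooled}"])
        )
    return pairs


hypotheses = statistical_hypotheses()
for null, alternatives in hypotheses:
    print("null:", null, "| alternative:", " or ".join(alternatives))
```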

Step 5: Research Frameworks

The research frameworks provide the necessary organizational
basis for abstracting and conceptualizing the massive volume of
statistical hypotheses (and statistical results that follow) into
a smaller and more intellectually manageable set of conclusions.
Three separate research frameworks have been chosen: (1) the
framework of possible overall comparison outcomes for a given
programming aspect, (2) the framework of dependencies and
intuitive relationships among the various programming aspects
considered, and (3) the framework of basic suppositions regarding
expected effects of the experimental treatments on the comparison
outcomes for the entire set of programming aspects. The first
framework is employed in the statistical conclusions step because
it can be applied in a statistically tractable manner, while the
remaining two frameworks are reserved for employment in the
research interpretations step since they are not statistically
tractable and involve subjective judgement.


Since a finite set of three different programming
environments (AI, AT, and DT) are being compared, there exists the
following finite set of thirteen possible overall comparison
outcomes for each aspect considered:

    AI = AT = DT

    AI < AT = DT  }
    AT = DT < AI  }   AI ≠ AT = DT

    AT < DT = AI  }
    DT = AI < AT  }   AT ≠ DT = AI

    DT < AI = AT  }
    AI = AT < DT  }   DT ≠ AI = AT

    AI < AT < DT  }
    AI < DT < AT  }
    AT < DT < AI  }   AI ≠ AT ≠ DT
    AT < AI < DT  }
    DT < AI < AT  }
    DT < AT < AI  }

There is a hierarchical lattice of increasing separation and
directionality among these possible overall comparison outcomes as
shown in Diagram 2. These thirteen possible overall comparison
outcomes comprise the first research framework and may be viewed
as providing a complete "answer space" for the questions of
interest. It is clear that any consistent set of two-way
comparisons (such as represented in the statistical hypotheses or
statistical results) may be associated with a unique one of these
three-way comparisons. This framework is the basis for organizing
and condensing the seven statistical results into one statistical
conclusion for each programming aspect considered.

Diagram 2.1 Lattice of Possible Directional Outcomes for Three-way Comparison

                                AI=AT=DT

(partially differentiated)
    AI<AT=DT  AT=DT<AI      AT<DT=AI  DT=AI<AT      DT<AI=AT  AI=AT<DT

(completely differentiated)
    AI<AT<DT  AI<DT<AT  AT<DT<AI  AT<AI<DT  DT<AI<AT  DT<AT<AI

N.B. The circles indicate which directional outcomes correspond to
the same nondirectional outcome.

Diagram 2.2 Lattice of Possible Nondirectional Outcomes for Three-way Comparison
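
The count of thirteen directional outcomes, and their grouping into
five nondirectional outcomes, can be checked by brute-force
enumeration. The sketch below is illustrative only; the function names
(dense, directional_outcomes, render, nondirectional) are hypothetical,
and each outcome is represented as a tuple giving every group a level,
with equal levels meaning "no difference".

```python
from itertools import product

GROUPS = ("AI", "AT", "DT")


def dense(levels):
    """Renumber levels as 0, 1, 2, ... preserving order and ties."""
    order = sorted(set(levels))
    return tuple(order.index(v) for v in levels)


def directional_outcomes(groups=GROUPS):
    """All distinct orderings-with-ties of the groups."""
    k = len(groups)
    return sorted({dense(lv) for lv in product(range(k), repeat=k)})


def render(levels, groups=GROUPS):
    """Write an outcome such as 'AI = AT < DT' from its per-group levels."""
    tiers = [
        [g for g, lv in zip(groups, levels) if lv == t]
        for t in range(max(levels) + 1)
    ]
    return " < ".join(" = ".join(tier) for tier in tiers)


def nondirectional(levels, groups=GROUPS):
    """Partition of the groups into tied blocks, ignoring direction."""
    return frozenset(
        frozenset(g for g, lv in zip(groups, levels) if lv == t)
        for t in set(levels)
    )


outcomes = directional_outcomes()
classes = {nondirectional(o) for o in outcomes}
for o in outcomes:
    print(render(o))
print(len(outcomes), "directional;", len(classes), "nondirectional")
```

Any consistent set of two-way comparisons fixes one such level
assignment, which is why it corresponds to a unique three-way outcome.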
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Since  a large  set  of  interrelated  programming  aspects  are 
being examined, it would be desirable to summarize many of the
"per aspect" hypotheses and results into statements which refer to
several aspects simultaneously. For example, average number of
statements per segment is one aspect directly dependent on two
other aspects: number of segments and number of statements. Other
interrelationships are more intuitive, less tractable, or only
suspected; for example, the "trade-off" between global variables
and formal parameters. A simple classification of the programming
aspects into groups of intuitively related aspects at least
provides a framework for jointly interpreting the corresponding
statistical conclusions in light of the underlying issues by which
the aspects themselves are related. The programming aspects
considered in this study were classified according to a particular
set of nine higher-level programming issues (such as data variable
organization, for example); details are given in Section V,
Interpretive Results. This second research framework is the basis
for abstracting and interpreting what the study's findings
indicate about these higher-level programming issues, as well as
explicitly mentioning several individual relationships among the
programming aspects and their conclusions.

Since the design of the experiments, the choice of treatment
factors, etc., were at least partially motivated by certain
general beliefs regarding software development (such as
'disciplined methodology reduces software development costs', for
example), it should be possible to explicitly state what
comparison outcomes among the experimental treatments were
expected a priori for which programming aspects. A list of
preplanned expectations (so-called "basic suppositions") for the
outcomes of each aspect's experiment would provide a framework for
evaluating how well the experimental findings as a whole support
the underlying general beliefs (by comparing the actual outcomes
with the basic suppositions across all the programming aspects).
Such a list of basic suppositions was conceived prior to
conducting the experiments, and it constitutes the third research
framework; details are given in Section V, Interpretive Results.
This framework is the basis for interpreting the study's findings
in terms of evidence in favor of the basic suppositions and
general beliefs.

Step 6: Experimental Design


The experimental design is the plan or setup according to
which the experiment is actually conducted or executed. It is
based upon the statistical model, and deals with practical issues
such as experimental units, treatment factors and levels,
experimental local control, etc. The experimental design employed
for this study has been discussed in considerable detail in
Section II, Specifics.


Step 7: Collected Data


The pertinent data to carry out the experimental design was
collected and processed to yield the information to which the
statistical test procedures were applied. Some details of this
execution phase are given in Section II, Specifics.

Step 8: Statistical Test Procedures


A statistical test procedure is a decision mechanism, founded
upon general principles of mathematical probability and
combinatorics and upon a specific statistical model (i.e.,
requiring certain assumptions), which is used to convert the
statistical hypotheses together with the collected data into the
statistical results. As dictated by the statistical model, the
statistical tests used in the study were nonparametric tests of
homogeneity of populations against shift alternatives for small
samples. Nonparametric tests are slightly more conservative (in
rejecting the null hypothesis) than their parametric counterparts;
nonparametric tests generally use the ordinal ranks associated
with a linear ordering of a set of scores, rather than the scores
themselves, in their computational formulas. In particular, the
standard Kruskal-Wallis H-test [Siegel 56; pp. 184-193] and
Mann-Whitney U-test [Siegel 56; pp. 116-127] were employed in the
statistical results step. Ryan's Method of Adjusted Significance
Levels [Kirk 63; pp. 97, 495-497], a standard procedure for
controlling the experimentwise significance level when several
tests are performed on the same scores as one experiment, was also
employed in the statistical conclusions step.

The Kruskal-Wallis test is used in three-sample situations to
test an X = Y = Z null hypothesis; its test statistic is computed
as

    H = 12 * [ (Rx*Rx/nx) + (Ry*Ry/ny) + (Rz*Rz/nz) ] / [ n*(n+1) ] - 3*(n+1)

where Rx, Ry, and Rz are the respective sums of the ranks for
scores from the X, Y, and Z samples; n equals nx+ny+nz; and nx,
ny, and nz are the respective sample sizes. The Mann-Whitney test
is used in two-sample situations to test an X = Y null hypothesis;
its test statistic is computed as

    U = min[ nx*ny + nx*(nx+1)/2 - Rx ; ny*nx + ny*(ny+1)/2 - Ry ]

where Rx, Ry, nx, and ny are defined as before.
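
For concreteness, the two test statistics can be computed directly
from the formulas above. The following is a minimal sketch, not the
study's actual software: the function names are hypothetical, and the
use of average ranks for tied scores is an assumption of this sketch
(the formulas as stated carry no explicit correction for ties).

```python
def rank_scores(scores):
    """Assign 1-based ranks; tied scores share their average rank (assumption)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def rank_sums(*samples):
    """Rank the pooled scores and return the sum of ranks for each sample."""
    pooled = [score for sample in samples for score in sample]
    ranks = rank_scores(pooled)
    sums, start = [], 0
    for sample in samples:
        sums.append(sum(ranks[start:start + len(sample)]))
        start += len(sample)
    return sums


def kruskal_wallis_h(x, y, z):
    """H = 12*[Rx^2/nx + Ry^2/ny + Rz^2/nz] / [n*(n+1)] - 3*(n+1)."""
    rx, ry, rz = rank_sums(x, y, z)
    nx, ny, nz = len(x), len(y), len(z)
    n = nx + ny + nz
    return 12 * (rx * rx / nx + ry * ry / ny + rz * rz / nz) / (n * (n + 1)) - 3 * (n + 1)


def mann_whitney_u(x, y):
    """U = min[nx*ny + nx*(nx+1)/2 - Rx ; ny*nx + ny*(ny+1)/2 - Ry]."""
    rx, ry = rank_sums(x, y)
    nx, ny = len(x), len(y)
    return min(nx * ny + nx * (nx + 1) / 2 - rx, ny * nx + ny * (ny + 1) / 2 - ry)


h = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
u = mann_whitney_u([1, 2, 3], [4, 5, 6])
print(h, u)
```

The scores in the usage lines are made-up values chosen so that the
three samples are completely separated; converting H or U to a critical
level still requires the appropriate statistical tables.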

For every statistical test, there exists a one-to-one
mapping, usually given in statistical tables, between the test
statistic (whose value is completely determined by the sample
data scores) and the critical level. The critical level α
[Conover 71; p. 61] is defined as the minimum significance level
at which the statistical test procedure would allow the null
hypothesis to be rejected (in favor of the alternative) for the
given sample data. Thus critical level represents a concise
standardized way to state the full result of any statistical test
procedure. Two-tailed rejection regions are applied for tests
involving nondirectional alternative hypotheses, and one-tailed
rejection regions are applied for tests involving directional
alternative hypotheses, so that the stated critical level always
pertains directly to the stated alternative hypothesis. A
decision to reject the null hypothesis and accept the alternative
is mandated if the critical level is low enough to be tolerated;
otherwise a decision to retain the null hypothesis is made.


Ryan's procedure is used in situations involving multiple
pairwise comparisons, in order to properly account for the fact
that each pairwise test is made in conjunction with the others,
using the same sample data. The individual critical levels α
obtained for each pairwise test in isolation are adjusted to
proper experimentwise critical levels α' via the formula

    α' = [(r+1)*k/2] * α

where k is the total number of samples, and r is the number of
(other) samples whose rank means fall between the rank means of
the particular pair of samples being compared. A simple "minimax"
step (taking the maximum of the several adjusted pairwise
critical levels, plus the overall comparison critical level, which
are all minimum significance levels) completes the procedure,
yielding a single critical level associated jointly with the
overall and pairwise comparisons.
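
As a minimal sketch of the adjustment and "minimax" step just described (the function names are hypothetical, and no cap at 1.0 is applied because the text does not mention one):

```python
def ryan_adjust(alpha, r, k):
    """Experimentwise level alpha' = [(r + 1) * k / 2] * alpha.

    alpha: critical level of one pairwise test taken in isolation
    r:     number of other samples whose rank means fall between the pair
    k:     total number of samples
    """
    return (r + 1) * k / 2 * alpha


def joint_critical_level(overall, pairwise, k):
    """Maximum of the overall level and the adjusted pairwise levels.

    pairwise is a list of (alpha, r) tuples, one per pairwise comparison.
    """
    adjusted = [ryan_adjust(alpha, r, k) for alpha, r in pairwise]
    return max([overall] + adjusted)
```

With k = 3 samples, a pair of adjacent groups (r = 0) has its level multiplied by 1.5, and the pair of extreme groups (r = 1) has its level multiplied by 3.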

These  tests  and  procedures  apply  straightforwardly  when 
differences  in  location  are  considered.  A slight  modification 
makes  them  applicable  for  differences  in  dispersion:  prior  to 
ranking,  each  score  value  is  simply  replaced  by  its  absolute 
deviation  from  the  corresponding  within-group  sample  median 
[Nemenyi  et  al.  77;  pp.  t66-'2',0D.  It  should  be  noted  that  this 
modification results in only an approximate method for solving a
tough  statistical  problem,  namely,  testing  whether  one  population 
is  more  variable  than  another  [Nemenyi  et  al.  77;  pp.  279-2?31. 

The  modification  is  not  strictly  statistically  "kosher"  in  the 
general  case  (it  weakens  the  power  of  the  test  procedures  and  can 
yield  inaccurate  critical  levels  when  testing  for  dispersion 
differences),  but  every  other  available  method  also  has  serious 
limitations. This method has been shown to possess reasonable
accuracy  as  long  as  the  underlying  distributions  are  fairly 
symmetrical  and  it  adapts  easily  to  the  study's  three-way 
comparison  situation. 
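
The score replacement used for dispersion comparisons can be sketched as follows. This is an illustrative sketch only: the function name and the dictionary layout are assumptions, and the transformed scores would then be ranked and tested exactly as for location comparisons.

```python
from statistics import median


def dispersion_scores(groups):
    """Replace each score by its absolute deviation from its own group's median.

    groups maps a group label (e.g. "AI", "AT", "DT") to a list of raw scores.
    """
    transformed = {}
    for label, scores in groups.items():
        center = median(scores)  # within-group sample median
        transformed[label] = [abs(score - center) for score in scores]
    return transformed
```

For example, `dispersion_scores({"AI": [1, 2, 3, 10]})` yields deviations measured from the median 2.5, so the outlying score 10 becomes the largest transformed value.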

Step 9: Statistical Results


A statistical result is essentially a decision reached by
applying a statistical test procedure to the set of collected and
refined data, regarding which one of the corresponding pair (null,
alternative) of statistical hypotheses is indeed supported by that
data. For each pair of statistical hypotheses, there is one
statistical result consisting of four components: (1) the null
hypothesis itself; (2) the alternative hypothesis itself; (3) the
critical level, stated as a probability value between 0 and 1; and
(4) a decision either to retain the null hypothesis or to reject
it in favor of (i.e., accept) the alternative hypothesis.

By convention, the null hypothesis purports that no systematic
difference appears to exist, and the alternative hypothesis
purports that some systematic difference seems to exist. The
critical level is associated with erroneously accepting the
alternative hypothesis (i.e., claiming a systematic difference
when none in fact exists). The decision to retain or reject is
reached on the basis of some tolerable level of significance, with
which the critical level is compared to see if it is low enough.

In cases where a null hypothesis is rejected, the appropriate
directional alternative hypothesis (if any) is used to indicate
the direction of the systematic difference, as determined by
direct observation from the sample medians in conjunction with a
one-tailed test.

Conventional practice is to fix an arbitrary significance
level (e.g., .05 or .01) in advance, to be used as the tolerable
level; critical levels then serve only as stepping-stones toward
reaching decisions and are not reported. For this partially
exploratory study, it was deemed more appropriate to fix a
tolerable level only for the purpose of a screening decision
(which simply purges those results with intolerably high critical
levels), and to carry the actual critical level along with each
statistical result. This unconventional practice yields
statistical results in a more meaningful and flexible form, since
the significance or error risk of each result may be assessed
individually, and results at other, more stringent significance
levels may be easily determined. Furthermore, the necessary
information is retained for properly recombining multiple related
results on an experimentwise basis in the statistical conclusions
step.


The tolerable level of significance used throughout this
study to screen critical levels was fixed at under .20. Although
fairly high for a confirmatory study, it is reasonable for a
partially exploratory study, such as this one, seeking to discover
even slight trends in the data. A critical level of .20 means
that the odds of obtaining test scores exhibiting the same degree
of difference, due to random chance fluctuations alone, are one in
five.


As an example, the seven statistical results for location
comparisons on the programming aspect STATEMENT TYPE COUNTS \ IF are
shown below. (N.B. The asterisks will be explained in Step 10.)

    null             alternative          critical    (screening)
    hypothesis       hypothesis           level       decision

    AI = AT = DT     ¬(AI = AT = DT)       .0630      reject
    AI = AT          AI < AT               .0465      reject
    AI = DT          AI ≠ DT              >.9999      retain
    AT = DT          DT < AT               .0111      reject
    AI+AT = DT       DT < AI+AT            .0834      reject  *
    AI+DT = AT       AI+DT < AT            .0089      reject
    AT+DT = AI       AT+DT ≠ AI            .3352      retain  *

Observe that the stated decisions simply reflect the application
of the .20 tolerable level to the stated critical levels. Results
under more stringent levels of significance can be easily
determined by simply applying a lower tolerable level to form the
decisions; e.g., at the .05 significance level, only the AI < AT,
DT < AT, and AI+DT < AT alternative hypotheses would be accepted;
only the AI+DT < AT hypothesis would be accepted at the .01 level.
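
The screening decision amounts to comparing each critical level with a chosen tolerable level. The following sketch replays the example; the data layout and function name are assumptions made for illustration, and the level reported as ">.9999" is entered as 0.9999.

```python
# (null hypothesis, alternative hypothesis, critical level) for the
# STATEMENT TYPE COUNTS \ IF location comparisons reported in the text.
RESULTS = [
    ("AI = AT = DT", "not (AI = AT = DT)", 0.0630),
    ("AI = AT", "AI < AT", 0.0465),
    ("AI = DT", "AI != DT", 0.9999),  # reported only as >.9999
    ("AT = DT", "DT < AT", 0.0111),
    ("AI+AT = DT", "DT < AI+AT", 0.0834),
    ("AI+DT = AT", "AI+DT < AT", 0.0089),
    ("AT+DT = AI", "AT+DT != AI", 0.3352),
]


def accepted_alternatives(results, tolerable):
    """Alternatives whose critical level falls under the tolerable level."""
    return [alt for _null, alt, level in results if level < tolerable]


for tolerable in (0.20, 0.05, 0.01):
    print(tolerable, accepted_alternatives(RESULTS, tolerable))
```

Because the critical level is carried with every result, the same list can be screened again at any stricter level without rerunning the tests.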


Step 10: Statistical Conclusions


The volume of statistical results is organized and condensed
into statistical conclusions according to the prearranged research
framework(s). A statistical conclusion is an abstraction of
several statistical results, but it retains the same statistical
character, having been derived via statistically tractable methods
and possessing an associated critical level.

Specifically, the first research framework mentioned above
was  employed  to  reduce  the  seven  statistical  results  (with  seven 
individual  critical  levels)  for  each  programming  aspect  to  a 
single  statistical  conclusion  (with  one  overall  critical  level) 
for  that  aspect.  The  statement  portion  of  a statistical 
conclusion  is  simply  one  of  the  thirteen  possible  overall 
comparison  outcomes.  Each  overall  comparison  outcome  is 
associated  with  a particular  set  of  statistical  results  whose 
outcomes  support  the  overall  comparison  outcome  in  a natural  way. 
For  example,  the  DT  = AI  < AT  conclusion  is  associated  with  the 
following  results: 

    reject AI = AT = DT in favor of ¬(AI = AT = DT),

    reject AI = AT in favor of AI < AT,

    retain AI = DT,

    reject AT = DT in favor of DT < AT, and

    reject AI+DT = AT in favor of AI+DT < AT.

Since the other two comparisons (AI+AT versus DT, AT+DT versus AI)
are in a sense orthogonal to the overall comparison outcome
(DT = AI < AT), their results are considered irrelevant to this
conclusion. The chart in Diagram 3 shows exactly which results
are associated with each conclusion: the relevant comparisons, the
null hypotheses to be retained, and the alternative hypotheses to
be accepted. The other portion of a statistical conclusion is the
critical level associated with erroneously accepting the statement
portion. It is computed from the individual critical levels of
certain germane results.

A simple  deterministic  algorithm,  based  on  the  chart  in 
Diagram  3,  was  used  to  generate  the  statistical  conclusions  (and 
compute  the  overall  critical  level)  automatically  from  the 
statistical  results.  For  each  programming  aspect,  the  algorithm 
compared  the  actual  results  obtained  for  the  seven  statistical 
hypotheses  pairs  with  the  results  associated  with  each  conclusion, 
searching  for  a match.  Ryan's  procedure  was  used  to  properly 
combine  the  individual  critical  levels  for  the  overall  result  and 
the  relevant  pairwise  results,  by  adjusting  them  via  the  formula 
and then taking their maximum. The critical levels for the
relevant pooled results were factored in via a simple formula
based on the multiplicative rule for the joint probability of
independent events.

Diagram 3. Association Chart for Results and Conclusions
[Chart body not recoverable. Surviving column headings:
(AI=AT=DT), AI<AT, DT<AI, DT<AT, DT<AI+AT, AI+DT<AT]

Continuing the example started in Step 9, the statistical
results shown there for location comparisons on the STATEMENT TYPE
COUNTS \ IF aspect are reduced to the statistical conclusion
DT = AI < AT with .0780 critical level overall. The five results
not marked with an asterisk in Step 9 match the five results
associated above with the DT = AI < AT outcome. (Note that the
other two marked results represent comparisons which are
irrelevant to this conclusion.) The .0465 and .0111 critical
levels for the two pairwise differences are adjusted to .0697 and
.0332, and the maximum of those adjusted values plus the .0630
overall difference critical level is .0697. The relevant pooled
comparison critical level of .0089 is factored in by taking the
complement of the product of the complements:

    1 - [(1 - .0697) * (1 - .0089)] = .0780
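
The "complement of the product of the complements" step can be checked with a short sketch. The function name is hypothetical; the two inputs are the values quoted in the example above.

```python
def pooled_critical_level(levels):
    """1 minus the product of (1 - level) over the given critical levels.

    This is the multiplicative rule for independent events: the chance that
    at least one of the component acceptances is erroneous.
    """
    all_correct = 1.0
    for level in levels:
        all_correct *= 1.0 - level
    return 1.0 - all_correct


# .0697 is the Ryan-adjusted maximum; .0089 is the pooled comparison level.
overall = pooled_critical_level([0.0697, 0.0089])
print(round(overall, 4))  # agrees with the .0780 quoted in the text
```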

Thus, the statistical conclusions are in one-to-one
correspondence with the research hypotheses and provide concise
answers on a "per aspect" basis to the questions of interest.
Further details and a complete listing of the statistical
conclusions for this study are presented below in Section IV.

Step 11: Research Interpretations

The final step in the approach is to interpret the
statistical conclusions in view of any remaining research
framework(s), the researchers' intuitive and professional
expectations, and the work of other researchers. These research
interpretations provide the opportunity to augment the objective
findings of the study with the researchers' own subjective
judgments and interpretations. The second and third research
frameworks mentioned above (namely, the intuitive relationships
among the various programming aspects and the basic suppositions
governing their expected outcomes) were considered important.
However, these particular research frameworks can only be utilized
for the research interpretations, since they are not amenable to
rigorous manipulation. Nonetheless, within these frameworks, which
are based upon intuitive understanding about the programming
aspects and software development environments under consideration,
the study bears some of its most interesting results and
implications. Complete details and discussion of the research
interpretations of this study appear in Section V.


IV. Objective Results

This section reports the objective results of the study,
namely, the statistical conclusions for each programming aspect
considered. The tone of discussion here is purposely somewhat
disinterested and analytical, in keeping with the empirical and
statistical character of these conclusions. All interpretive
discussion is deferred to Section V.

Each statistical conclusion is expressed in the concise form
of a three-way comparison outcome "equation." It states any
observed differences, and the directions thereof, among the
programming environments represented by the three groups examined
in the study: ad hoc individuals (AI), ad hoc teams (AT), and
disciplined teams (DT). The equality AI = AT = DT expresses the
null conclusion that there is no systematic difference among the
groups. An inequality, e.g., AI < AT = DT or DT < AI < AT,
expresses a non-null (or alternative) conclusion that there are
certain systematic difference(s) among the groups in stated
direction(s). A critical level (or risk) value is also associated
with each non-null (or alternative) conclusion, indicating its
individual reliability. This value is the probability of having
erroneously rejected the null conclusion in favor of the
alternative; it also provides a relative index of how pronounced
the differences were in the sample data.

The remainder of this section consists of (a) presenting the
full set of conclusions, (b) evaluating their impact as a whole,
(c) exposing a "relaxed differentiation" view of the conclusions,
(d) exposing a "directionless" view of the conclusions, and (e)
individually highlighting a few of the more noteworthy
conclusions.

Presentation

Instances of non-null (or alternative) conclusions indicating
some distinction among the groups on the basis of a particular
programming aspect are listed by outcome in Tables 2.1 (for
location comparisons) and 2.2 (for dispersion comparisons). A
complete itemization of these distinctions, in English prose form,
appears in Appendix 2. The complete set of statistical
conclusions for both location and dispersion comparisons appears
in Table 3, arranged by programming aspect.

Examination of Table 3 immediately demonstrates that a large
number of the programming aspects considered in this study,
especially product aspects, failed to show any distinction between
the groups. This low "yield" is not surprising, especially among
product aspects, and may be attributed to the partially
exploratory nature of the study, the small sample sizes, and the
general coarseness of many of the aspects considered. The issue
of these null outcome occurrences and their significance is
treated more thoroughly in the next subsection, Impact Evaluation.

It is worth noting, however, that several of the null
conclusions may indicate characteristics inherent to the
application itself. As one example, the basic symbol-table/scan/
parse/code-generation nature of a compiler strongly influences the
way the system is modularized and thus practically determines the
number of modules in the final product (give or take some
occasional slight variation due to other design decisions).

Impact Evaluation

These statistical conclusions have a certain objective
character (since they are statistically inferred from empirical
data), and their collective impact may be objectively evaluated
according to the following statistical principle [Tukey 69; pp.
84-85]. Whenever a series of statistical tests (or experiments)
are made, all at a fixed level of significance (for example, .10),
a corresponding percentage (in the example, 10%) of the tests are
expected a priori to reject the null hypothesis in the complete
absence of any true effect (i.e., due to chance alone). This
Table 3. Statistical Conclusions

N.B. A simple pair of equal signs (= =) appears in place of the null
outcome AI = AT = DT in order to avoid cluttering the table excessively.

                                             location             dispersion
                                             comparison           comparison
programming aspect                           outcome : critical   outcome : critical
                                                       level                level

(aspect name not legible)                    = =                  = =
  MODIFIED                                   = =                  = =
  UNMODIFIED                                 = =                  = =
NONGLOBAL                                    = =                  AI = AT < DT : 0.0750
  PARAMETER                                  AI < AT = DT : 0.1507  AI = AT < DT : 0.0557
    VALUE                                    = =                  AI = AT < DT : 0.0943
    REFERENCE                                = =                  AI < AT = DT : 0.1529
  LOCAL                                      AT = DT < AI : 0.1090  = =
AVERAGE GLOBAL VARIABLES PER MODULE          = =                  = =
  ENTRY                                      = =                  = =
  NONENTRY                                   = =                  = =
  MODIFIED                                   = =                  DT = AI < AT : 0.1100
  UNMODIFIED                                 = =                  = =
AVERAGE NONGLOBAL VARIABLES PER SEGMENT      = =                  = =
  PARAMETER                                  AI < AT = DT : 0.1748  = =
  LOCAL                                      = =                  = =
                                             location             dispersion
                                             comparison           comparison
programming aspect                           outcome : critical   outcome : critical
                                                       level                level

PARAMETER PASSAGE TYPE PERCENTAGES
  VALUE                                      = =                  AI < AT = DT : 0.1606
  REFERENCE                                  = =                  AI < AT = DT : 0.1606
(SEG,GLOBAL) ACTUAL USAGE PAIRS              = =                  = =
  ENTRY                                      = =                  = =
    MODIFIED                                 = =                  = =
    UNMODIFIED                               = =                  = =
  NONENTRY                                   = =                  = =
    MODIFIED                                 = =                  = =
    UNMODIFIED                               = =                  = =
  MODIFIED                                   = =                  AT < DT = AI : 0.1061
  UNMODIFIED                                 = =                  = =
(SEG,GLOBAL) POSSIBLE USAGE PAIRS            AI < AT = DT : 0.1227  AI < DT < AT : 0.0523
  ENTRY                                      = =                  = =
    MODIFIED                                 = =                  = =
    UNMODIFIED                               = =                  = =
  NONENTRY                                   = =                  AI < AT = DT : 0.0786
    MODIFIED                                 = =                  DT = AI < AT : 0.0510
    UNMODIFIED                               = =                  AI < DT < AT : 0.1727
  MODIFIED                                   = =                  = =
  UNMODIFIED                                 = =                  = =
(SEG,GLOBAL) USAGE RELATIVE PERCENTAGES      = =                  = =
  ENTRY                                      AT < DT < AI : 0.1173  = =
    MODIFIED                                 AT < DT < AI : 0.1232  = =
    UNMODIFIED                               = =                  = =
  NONENTRY                                   = =                  = =
    MODIFIED                                 = =                  = =
    UNMODIFIED                               AT < DT = AI : 0.1546  = =
  MODIFIED                                   = =                  = =
  UNMODIFIED                                 = =                  = =
(SEG,GLOBAL,SEG) DATA BINDINGS
  ACTUAL                                     = =                  = =
    SUBFUNCTIONAL                            = =                  = =
    INDEPENDENT                              = =                  AI < AT = DT : 0.1963
  POSSIBLE                                   DT = AI < AT : 0.1861  DT = AI < AT : 0.1529
  RELATIVE PERCENTAGE                        = =                  = =

expected rejection percentage provides a comparative index of the
true impact of the test results as a whole (in the example, a 25%
actual rejection percentage would indicate that a truly
significant effect, other than chance alone, was operative).


The point here may be illustrated in terms of simple
coin-tossing experiments. The nature of statistics itself
dictates that, out of a series of 100 separate statistical tests
of a hypothetically fair coin at the .05 significance level,
roughly 5 of those tests would nonetheless indicate that the coin
was biased; if only 6 out of 100 tests of a real coin indicate
bias at the .05 level, those six results have very little impact
since the coin is behaving rather unbiasedly over the full set of
tests.
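
The coin-tossing argument can be made concrete with a binomial calculation. The sketch below assumes the 100 tests are independent and that every null hypothesis is true; the function names are hypothetical.

```python
from math import comb


def expected_rejections(n_tests, alpha):
    """Number of rejections expected from chance alone."""
    return n_tests * alpha


def prob_at_least(k, n_tests, alpha):
    """P(at least k of n independent tests reject when every null is true)."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )


print(expected_rejections(100, 0.05))  # about 5 spurious rejections expected
print(prob_at_least(6, 100, 0.05))     # roughly 0.38: six is unremarkable
```

Under these assumptions, seeing six or more rejections out of 100 happens by chance more than a third of the time, which is why six such results carry little collective weight.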


This same "multiplicity" principle applies to the statistical
conclusions of the study, since they represent the outcomes of a
series of separate tests and were assumed in the statistical model
to be separate experiments. It is appropriate to evaluate the
location and dispersion results separately, since they reflect two
separate issues (expectancy and predictability) of software
development behavior. Similarly, it is also appropriate to
evaluate the process and product results separately. Finally, it
is only fair to evaluate the "confirmatory" aspects as a distinct
subset of all aspects examined, since they alone had been honestly
considered prior to collecting and analyzing the data.

The details of this impact evaluation for the study's
objective results, broken down into the appropriate categories
identified above, are presented in the following table. The
evaluation was performed at the α = .20 significance level used for
screening purposes, hence the expected rejection percentage for
any category was 20%. For each category of aspects, the table
gives the number of (nonredundant) programming aspects, the
expected (rounded to whole numbers) and actual numbers of
rejections (of the null conclusion in favor of a directional
alternative), and the expected and actual rejection percentages.
An asterisk marks those categories demonstrating noticeable
statistical impact (i.e., actual rejection percentage well above
expected rejection percentage).


                            number    expected   actual    expected   actual
    category                of        num. of    num. of   reject.    reject.
                            aspects   reject.    reject.   percent    percent

    location                  130       26         32       20.0       24.6
      process                  10        2          9       20.0       90.0
        confirmatory only       6        1          6       20.0      100.0
      product                 120       24         23       20.0       19.2
        confirmatory only      29        6         12       20.0       41.3
      confirmatory only        35        7         18       20.0       51.4

    dispersion                130       26         32       20.0       24.6
      process                  10        2          2       20.0       20.0
        confirmatory only       6        1          0       20.0        0.0
      product                 120       24         30       20.0       25.0
        confirmatory only      29        6          9       20.0       31.0
      confirmatory only        35        7          9       20.0       25.7

    (The asterisk markings are not legible in this copy.)
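
Each row of the impact table follows from three inputs: the number of aspects, the number of actual rejections, and the screening level. A minimal sketch of that arithmetic (the function name and the returned dictionary keys are assumptions):

```python
def impact_row(n_aspects, actual_rejections, alpha=0.20):
    """Expected versus actual rejection figures for one category of aspects."""
    return {
        "expected_num": round(n_aspects * alpha),  # rounded to a whole number
        "expected_pct": round(100 * alpha, 1),
        "actual_pct": round(100 * actual_rejections / n_aspects, 1),
    }


# The full set of location comparisons: 130 aspects, 32 rejections.
print(impact_row(130, 32))
```

A category shows noticeable impact when its actual percentage is well above the expected 20%, as with the six confirmatory process aspects for location (`impact_row(6, 6)` gives 100.0).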

The table shows that the location results, dealing with the
expectancy of software development behavior, do have statistical
impact in several subcategories. Process aspects have more impact
than product aspects on the whole, but when tempered by
consideration of the distinction between "confirmatory" and
"exploratory" aspects, the study's location results bear strong
statistical impact for both process and product. They are better
explained as the consequence of some true effect related to the
experimental treatments, rather than as a random phenomenon.

It is also clear from the table that the dispersion results,
dealing with the predictability of software development behavior,
have little statistical impact in general. This is due primarily
to the diminished power of statistical procedures used to test for
dispersion differences, compounded by the small sample sizes
involved and the coarseness of many of the programming aspects
themselves. The lack of strong statistical impact in this area of
the study does not mean that the dispersion issue is unimportant
or undeserving of research attention, but rather that it is "a
tougher nut to crack" than the location issue. The study's
dispersion results are still worth perusing, however, as possible
hints of where differences might exist, provided this disclaimer
regarding their impact is heeded.

As described in Section III, the research framework of
possible three-way comparison outcomes provided the basis for
converting the statistical results into the statistical
conclusions. This framework has two inherent structural
characteristics that may be exploited to make additional
observations regarding the statistical conclusions. These
structural characteristics and the supplemental views of the
conclusions that they afford are described here and in the next
subsection.

Specifically, the first structural characteristic is that
each completely differentiated outcome is related to a specific
pair of partially differentiated outcomes, as shown in the lattice
of Diagram 2.1. For example, AI < AT < DT, a completely
differentiated outcome, naturally weakens to either AI < AT = DT
or AI = AT < DT, two partially differentiated outcomes.
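
The weakening of a completely differentiated outcome can be sketched as a small string transformation. The representation of outcomes as strings such as "AI < AT < DT" and the function name are assumptions made for illustration.

```python
def relax(outcome):
    """Return the two partially differentiated outcomes of a complete one.

    'AI < AT < DT' -> ('AI < AT = DT', 'AI = AT < DT')
    """
    if "=" in outcome:
        raise ValueError("outcome is not completely differentiated")
    groups = [g.strip() for g in outcome.split("<")]
    if len(groups) != 3:
        raise ValueError("expected exactly three groups")
    low, mid, high = groups
    # Relax one "inner" difference at a time; the "outer" one (low < high)
    # is preserved in both results.
    return (f"{low} < {mid} = {high}", f"{low} = {mid} < {high}")


print(relax("AI < AT < DT"))
```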

Each completely differentiated outcome consists of three
pairwise differences (AI < AT, AT < DT, AI < DT in the example),
while each partially differentiated outcome consists of only two
pairwise differences plus one pairwise equality (AI < DT, AI < AT,
AT = DT and AI < DT, AT < DT, AI = AT in the example). The
"outer" difference of the completely differentiated outcome
(AI < DT in the example) is common to both partially
differentiated outcomes, while each partially differentiated
outcome focuses attention on one of the two "inner" differences
(AI < AT and AT < DT in the example) to the exclusion of the other
"inner" difference, which is "relaxed" to an equality. Within a
statistical environment or model which places a premium on
claiming differences instead of equalities, a partially
differentiated outcome is a safer statement, containing less
error-prone information than a completely differentiated outcome.
Since these outcomes represent statistical conclusions, the same
data scores which support a completely differentiated outcome at a
certain critical level also support each of the two related
partially differentiated outcomes at lower critical levels.

Thus, every completely differentiated conclusion may also be
considered as two (more significant) partially differentiated
conclusions, each of these three conclusions having equal and
complete statistical legitimacy. The "outer" difference of a
completely differentiated conclusion is, of course, stronger than
either of its two "inner" differences; but the strengths of the
two "inner" differences (relative to each other) will vary in
accordance with the data scores and indeed are reflected in the
significance levels of the two corresponding partially
differentiated conclusions (relative to each other). Tables 4.1
and 4.2 give the details of this "relaxed differentiation"
analysis for each of the completely differentiated conclusions
found in the study, and an English paraphrase appears in Appendix
5. All of the partially differentiated conclusions listed in
these tables should be added to those presented in Tables 2 and 3;
they deserve full consideration in any analysis or interpretation
of the study's findings. However, in the case that one of a
partially differentiated pair is noticeably stronger than the
other, it is fair to consider only the stronger one for the
purpose of analysis or interpretation dealing primarily with
partially differentiated outcomes, since the study is mainly
concerned with the most pronounced difference afforded by each
aspect's data scores.

A Directionless View

The second structural characteristic of the possible outcome
framework is that the outcomes may be classified into another
closely related set of directionless outcomes, as shown in the
lattice of Diagram 2.2. For example, AI < AT = DT and
AT = DT < AI, two directional partially differentiated outcomes,
both correspond to AI ≠ AT = DT, a nondirectional partially
differentiated outcome. All six of the directional completely
differentiated outcomes correspond to the single nondirectional
completely differentiated outcome AI ≠ AT ≠ DT.
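
One way to picture the directionless classification is to keep only which groups an outcome treats as equal and discard the ordering. The sketch below does this by mapping an outcome string to a partition of the three groups; the string representation and function name are assumptions for illustration.

```python
def directionless(outcome):
    """Partition of the groups implied by an outcome, ignoring direction.

    'AI < AT = DT' and 'AT = DT < AI' both map to {{AI}, {AT, DT}}.
    """
    return frozenset(
        frozenset(group.strip() for group in level.split("="))
        for level in outcome.split("<")
    )


# Two directional outcomes that share one directionless category.
print(directionless("AI < AT = DT") == directionless("AT = DT < AI"))
```

Under this representation the thirteen directional outcomes collapse to five categories: the null outcome, three "one group differs from the other two" categories, and the single fully differentiated category.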

Table 4.1 Relaxed Differentiation for Location Comparisons

                                       completely               partially
                                       differentiated           differentiated
                                       conclusion               conclusions
programming aspect                     comparison   : critical  comparison   : critical
                                       outcome        level     outcome        level

PROGRAM CHANGES                        DT < AI < AT : 0.1848    DT < AI = AT : 0.0037
                                                                DT = AI < AT : 0.1846

LINES                                  AI < DT < AT : 0.1194    DT = AI < AT : 0.0617
                                                                AI < AT = DT : 0.1132

(SEG,GLOBAL) USAGE RELATIVE            AT < DT < AI : 0.1173    AT < DT = AI : 0.0826
PERCENTAGES \ ENTRY                                             AT = DT < AI : 0.1111

(SEG,GLOBAL) USAGE RELATIVE            AT < DT < AI : 0.1232    AT < DT = AI : 0.1132
PERCENTAGES \ ENTRY \ MODIFIED                                  AT = DT < AI : 0.1132


Table 4.2 Relaxed Differentiation for Dispersion Comparisons

                                       completely               partially
                                       differentiated           differentiated
                                       conclusion               conclusions
programming aspect                     comparison   : critical  comparison   : critical
                                       outcome        level     outcome        level

MAX UNIQUE COMPILATIONS F.A.O. MODULE  DT < AI < AT : 0.0514    DT < AI = AT : 0.0036
                                                                DT = AI < AT : 0.0511

STATEMENT TYPE COUNTS \ RETURN         DT < AI < AT : 0.1398    DT = AI < AT : 0.0035
                                                                DT < AI = AT : 0.1395

(SEG,GLOBAL) POSSIBLE USAGE PAIRS      AI < DT < AT : 0.0523    AI < AT = DT : 0.0207
                                                                DT = AI < AT : 0.0511

(SEG,GLOBAL) POSSIBLE USAGE            AI < DT < AT : 0.1727    AI < AT = DT : 0.1167
PAIRS \ NONENTRY \ UNMODIFIED                                   DT = AI < AT : 0.1561
By emphasizing only the observed distinctions between the
groups, these directionless outcome categories focus attention on
the original research issue of how observable programming aspects
reflect differences among the three programming environments. In
particular, there are three nondirectional partially
differentiated outcomes (each of the form "one group different
from the other two, which are similar"), and it is noteworthy to
observe just what set of programming aspects supports each of
these basic distinctions. It is fairly easy to coalesce the
directional distinctions from Table ? into the directionless
categories by eye, but a complete itemization of directionless
distinctions is provided in Appendix 4. It is interesting to note
that, for location comparisons, the directionless distinctions
segregate cleanly along the process versus product dichotomy line:
all of the product distinctions fall into the AI ≠ AT = DT and
AT ≠ DT = AI directionless categories, while all of the process
distinctions fall into the DT ≠ AI = AT directionless category.
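
The coalescing of directional outcomes into directionless categories can be sketched in a few lines of Python. This is only an illustration and not part of the study's analysis procedure: the function name `directionless`, the string notation for outcomes, and the label returned for a completely differentiated outcome are all assumptions made here, and `!=` stands for the "different from" relation.

```python
GROUPS = {"AI", "AT", "DT"}


def directionless(outcome: str) -> str:
    """Collapse a directional outcome such as 'DT < AI = AT' into a
    directionless category such as 'DT != AI = AT' (hypothetical helper)."""
    tokens = outcome.split()
    if len(tokens) != 5 or {tokens[0], tokens[2], tokens[4]} != GROUPS:
        raise ValueError(f"not a three-group outcome: {outcome!r}")
    first, op1, second, op2, third = tokens
    if (op1, op2) == ("=", "="):
        return "AI = AT = DT"      # null outcome: no distinction at all
    if (op1, op2) == ("<", "<"):
        return "AI != AT != DT"    # completely differentiated (assumed label)
    if (op1, op2) == ("<", "="):
        odd, pair = first, sorted([second, third])
    elif (op1, op2) == ("=", "<"):
        odd, pair = third, sorted([first, second])
    else:
        raise ValueError(f"unrecognized operators in {outcome!r}")
    # The odd group is named first; the similar pair is listed alphabetically.
    return f"{odd} != {pair[0]} = {pair[1]}"


for example in ("DT < AI = AT", "AI = AT < DT", "AT = DT < AI"):
    print(example, "->", directionless(example))
```

Under these assumptions, both DT < AI = AT and AI = AT < DT collapse into the same directionless category, which is the sense in which the direction of the difference is discarded.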


Individual Highlights

The purpose of this concluding subsection is simply to draw
attention to what seem to be the "top ten" (or so) most noteworthy
conclusions from among the study's objective results. These
conclusions are interesting individually, either because the
programming aspect itself has general appeal or because the
difference in behavior expectancy or predictability is well
pronounced (as indicated by a low critical significance level) in
the experimental sample data.


Noteworthy location distinctions are mentioned below.

1.  According to the DT < AI = AT outcome on the COMPUTER JOB
    STEPS aspect, the disciplined teams used very noticeably
    fewer computer job steps (i.e., module compilations, program
    executions, or miscellaneous job steps) than both the ad hoc
    individuals and the ad hoc teams.

2.  This same difference was apparent in the total number of
    module compilations, the number of unique (i.e., not an
    identical recompilation of a previously compiled module)
    module compilations, the number of program executions, and
    the number of essential job steps (i.e., unique module
    compilations plus program executions), according to the
    DT < AI = AT outcomes on the COMPUTER JOB STEPS\MODULE
    COMPILATIONS, COMPUTER JOB STEPS\MODULE COMPILATIONS\UNIQUE,
    COMPUTER JOB STEPS\PROGRAM EXECUTIONS, and ESSENTIAL JOB
    STEPS aspects, respectively.

3.  According to the DT < AI = AT outcome on the PROGRAM CHANGES
    aspect, the disciplined teams required fewer textual
    revisions to build and debug the software than the ad hoc
    individuals and the ad hoc teams.

4.  There was a definite trend for the ad hoc individuals to have
    produced fewer total symbolic lines (includes comments,
    compiler directives, statements, declarations, etc.) than the
    disciplined teams, who produced fewer than the ad hoc teams,
    according to the AI < DT < AT outcome on the LINES aspect.

5.  According to the AI < AT = DT outcome on the SEGMENTS aspect,
    the ad hoc individuals organized their software into
    noticeably fewer routines (i.e., functions or procedures)
    than either the ad hoc teams or the disciplined teams.

6.  The ad hoc individuals displayed a trend toward having a
    greater number of statements per routine than did either the
    ad hoc teams or the disciplined teams, according to the
    AT = DT < AI outcome on the AVERAGE STATEMENTS PER SEGMENT
    aspect.

7.  According to the DT = AI < AT outcomes on the STATEMENT TYPE
    COUNTS\IF and STATEMENT TYPE PERCENTAGES\IF aspects, both the
    ad hoc individuals and the disciplined teams coded noticeably
    fewer IF statements than the ad hoc teams, in terms of both
    total number and percentage of total statements.

8.  According to the DT = AI < AT outcome on the DECISIONS aspect,
    both the ad hoc individuals and the disciplined teams tended
    to code fewer decisions (i.e., IF, WHILE, or CASE statements)
    than the ad hoc teams.

9.  Both the ad hoc teams and the disciplined teams declared a
    noticeably larger number of data variables (i.e., scalars or
    arrays of scalars) than the ad hoc individuals, according to
    the AI < AT = DT outcome on the DATA VARIABLES aspect.

10. According to the AT = DT < AI outcome on the DATA VARIABLE
    SCOPE PERCENTAGES\NONGLOBAL\LOCAL aspect, the ad hoc
    individuals had a larger percentage of local variables
    compared to the total number of declared data variables than
    either the ad hoc teams or the disciplined teams.

11. There was a slight trend for both the ad hoc individuals and
    the disciplined teams to have fewer potential data bindings
    [Stevens, Myers, and Constantine 74] (i.e., occurrences of
    the situation where a global variable could be modified by
    one segment and accessed by another due to the software's
    modularization) than the ad hoc teams, according to the
    DT = AI < AT outcome on the (SEG, GLOBAL, SEG) DATA
    BINDINGS\POSSIBLE aspect.

Noteworthy dispersion distinctions are mentioned below.

1.  There was a noticeable difference in variability, with the
    disciplined teams less than the ad hoc individuals less than
    the ad hoc teams, in the maximum number of unique
    compilations for any one module, according to the
    DT < AI < AT outcome on the MAX UNIQUE COMPILATIONS F.A.O.
    MODULE aspect.

2.  The ad hoc individuals exhibited noticeably greater variation
    than either the ad hoc teams or the disciplined teams in the
    number of miscellaneous job steps (i.e., auxiliary
    compilations or executions of something other than the final
    software project), according to the AT = DT < AI outcome on
    the COMPUTER JOB STEPS\MISCELLANEOUS aspect.

3.  According to the DT = AI < AT outcome on the AVERAGE SEGMENTS
    PER MODULE aspect, the ad hoc individuals and the disciplined
    teams both exhibited noticeably less variation in the average
    number of routines per module than the ad hoc teams.

4.  According to the DT = AI < AT outcomes on the STATEMENT TYPE
    COUNTS\RETURN and STATEMENT TYPE PERCENTAGES\RETURN aspects,
    the ad hoc teams showed rather noticeably greater variability
    in the number (both raw count and normalized percentage) of
    RETURN statements coded than both the disciplined teams and
    the ad hoc individuals.

5.  In the number of calls to programmer-defined routines, the ad
    hoc individuals displayed noticeably greater variation than
    both the ad hoc teams and the disciplined teams, according to
    the AT = DT < AI outcome on the INVOCATIONS\NONINTRINSIC
    aspect.

6.  According to the DT < AI = AT outcome on the DATA VARIABLE
    SCOPE PERCENTAGES\GLOBAL\NONENTRY\MODIFIED aspect, the
    disciplined teams displayed noticeably smaller variation than
    either the ad hoc individuals or the ad hoc teams in the
    percentage of commonplace (i.e., ordinary scope and modified
    during execution) global variables compared to the total
    number of data variables declared.

7.  The ad hoc individuals displayed noticeably less variation in
    the number of formal parameters passed by reference than both
    the ad hoc teams and the disciplined teams, according to the
    AI < AT = DT outcome on the DATA VARIABLE SCOPE
    COUNTS\NONGLOBAL\PARAMETER\REFERENCE aspect.

8.  According to the AI < DT < AT outcome on the (SEG, GLOBAL)
    POSSIBLE USAGE PAIRS aspect, there was a noticeable
    difference in variability, with the ad hoc individuals less
    than the disciplined teams less than the ad hoc teams, for
    the total number of possible segment-global usage pairs
    (i.e., occurrences of the situation where a global variable
    could be modified or accessed by a segment).

9.  According to the DT = AI < AT outcome on the (SEG, GLOBAL, SEG)
    DATA BINDINGS\POSSIBLE aspect, the ad hoc teams tended toward
    greater variability than either the ad hoc individuals or the
    disciplined teams in the number of potential data bindings.
V. Interpretive Results


This section reports the interpretive results of the study,
namely the research interpretations based on the conclusions
presented in Section IV. The tone of discussion here is purposely
somewhat subjective and opinionated, since the study's most
important results are derived from interpreting the experiment's
immediate findings in view of the study's overall goals. These
interpretations also express the researchers' own estimation of
the study's implications and general import according to their
professional intuitions about programming and software.

The interpretations presented here are neither exhaustive nor
unique. They only touch upon certain overall issues and generally
avoid attaching meaning to or giving explanation for individual
aspects or outcomes. It is anticipated that the reader and other
researchers might formulate additional or alternative
interpretations of the study's factual findings, using their own
intuitive judgments.

Two distinct sets of research interpretations are discussed
in the remainder of this section. The first set states general
trends in the conclusions according to the basic suppositions of
the study. The second set states general trends in the
conclusions based on classifications which reflect certain
abstract programming notions (e.g., cost, modularity, data
organization, etc.).

According to Basic Suppositions:

The study's basic suppositions are a set of the simplest a
priori expectations (or "hypotheses") for the outcomes of location
and dispersion comparisons on process and product aspects. They
are stated in the following table:


|                     | Basic Suppositions on Location and     |
|                     | Dispersion Comparisons                 |
|                     |                                        |
| for Process Aspects | DT < AI = AT                           |
| for Product Aspects | DT = AI < AT  or  AT < DT = AI         |

The basic suppositions are founded upon certain general
beliefs regarding software development, which had been formulated
by the researchers prior to conducting the experiment. The
principal beliefs are that

(a)  methodological discipline is the key influence on the
     general efficiency of the process itself,

(b)  the disciplined methodology reduces the cost and
     complexity of the process and enhances the
     predictability of the process as well,

(c)  the preferred direction of both location and dispersion
     differences on process aspects is clear and undebatable,
     due to the tangibleness of the process aspects
     themselves and the direct applicability of expected
     values and variations in terms of average cost estimates
     and tightness of cost estimates,

(d)  "mental cohesiveness" (or conceptual integrity [Brooks
     75; pp. 41-50]) is the key influence on the general
     quality of the product itself,

(e)  a programming team is naturally burdened (relative to an
     individual programmer) by the organizational overhead
     and risk of error-prone misunderstanding inherent in
     coordinating and interfacing the thoughts and efforts of
     those on the team,

(f)  the disciplined methodology induces an effective mental
     cohesiveness, enabling a programming team to behave more
     like an individual programmer with respect to conceptual
     control over the program, its design, its structure,
     etc., because of the discipline's antiregressive,
     complexity-controlling [Belady and Lehman 76; p. 245]
     effect that compensates for the inherent organizational
     overhead of a team, and

(g)  the preferred direction of both location and dispersion
     differences on product aspects is not always clear
     (occasionally subject to diverging viewpoints), due to
     the intangibleness of many of the product aspects and a
     general lack of understanding regarding the implication
     of dispersion comparisons themselves for product
     aspects.

Against the background of these general beliefs and basic
suppositions, each possible comparison outcome takes on a new
meaning, depending on whether it would substantiate or contravene
the general beliefs. For process aspects,

(1)  outcome DT < AI = AT, the supposition itself, is directly
     supportive of the beliefs;

(2)  outcomes DT < AI < AT and DT < AT < AI, which are
     completely differentiated variations of the
     supposition's main theme, are indirectly supportive of
     the beliefs, especially when DT < AI = AT is the
     stronger of the two corresponding partially
     differentiated outcomes;

(3)  outcome AI = AT = DT may discredit the beliefs, or it may
     be considered neutral for any one of several possible
     reasons [(a) the critical level for a non-null outcome
     is just not low enough, so the aspect defaults to the
     null outcome; (b) the aspect simply reflects something
     characteristic of the application itself (or another
     factor common to all the groups in the experiment); or
     (c) the aspect actually measures something fundamental
     to the software development phenomenon in general and
     would always result in the null outcome]; and

(4)  all other outcomes discredit the beliefs.

For product aspects,

(1)  outcomes AT ≠ DT = AI [AT < DT = AI, DT = AI < AT], the
     supposition itself, are directly supportive of the
     beliefs;

(2)  outcomes AI < DT < AT and AT < DT < AI, which may be
     considered as approximations of the suppositions (DT is
     distinct from AT but falls short of AI, due to lack of
     experience or maturity in the disciplined methodology),
     are indirectly supportive of the beliefs, especially
     when DT = AI < AT and AT < DT = AI (respectively) are
     the stronger of the two corresponding partially
     differentiated outcomes;

(3)  outcome AI = AT = DT may discredit the beliefs, or it may
     be considered neutral for any one of several possible
     reasons [(a) the critical level for a non-null outcome
     is just not low enough, so the aspect defaults to the
     null outcome; (b) the aspect simply reflects something
     characteristic of the application itself (or another
     factor common to all the groups in the experiment); (c)
     the aspect actually measures something fundamental to
     the software development phenomenon in general and would
     always result in the null outcome; or (d) several of the
     study's hit-and-miss collection of "exploratory" product
     aspects are simply duds and may be ignored as useless
     software measures]; and

(4)  all other outcomes discredit one or more of the beliefs.
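
The two lists of outcome categories amount to a small lookup rule, which can be sketched as follows. This is an illustrative reading of the lists, not the researchers' own procedure: the names `tiers`, `classify`, and `RULES`, and the category labels "direct", "indirect", "null", and "discrediting", are assumptions introduced here. Note that the text treats the null outcome as either neutral or discrediting depending on circumstances, so the sketch simply reports it as "null".

```python
def tiers(outcome: str):
    """Parse 'DT < AI = AT' into ordered tiers: ({'DT'}, {'AI', 'AT'})."""
    tokens = outcome.split()
    result = [{tokens[0]}]
    for op, group in zip(tokens[1::2], tokens[2::2]):
        if op == "=":
            result[-1].add(group)      # same tier as the previous group
        elif op == "<":
            result.append({group})     # start a new, higher tier
        else:
            raise ValueError(f"unexpected operator {op!r} in {outcome!r}")
    return tuple(frozenset(t) for t in result)


# Outcomes named in the text as directly or indirectly supportive.
RULES = {
    "process": {
        "direct": ["DT < AI = AT"],
        "indirect": ["DT < AI < AT", "DT < AT < AI"],
    },
    "product": {
        "direct": ["AT < DT = AI", "DT = AI < AT"],
        "indirect": ["AI < DT < AT", "AT < DT < AI"],
    },
}


def classify(kind: str, outcome: str) -> str:
    """Return 'direct', 'indirect', 'null', or 'discrediting'."""
    parsed = tiers(outcome)
    if len(parsed) == 1:
        return "null"                  # AI = AT = DT: neutral or discrediting
    for label in ("direct", "indirect"):
        if any(parsed == tiers(s) for s in RULES[kind][label]):
            return label
    return "discrediting"


print(classify("process", "DT < AI = AT"))
print(classify("product", "AI < AT = DT"))
```

Parsing each outcome into ordered tiers makes the comparison insensitive to the order in which equal groups are written, so "DT < AI = AT" and "DT < AT = AI" are treated as the same outcome.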

Thus the interpretation of the study's findings according to
the basic suppositions consists simply of a general assessment of
how well the research conclusions have borne out the basic
suppositions and how well the experimental evidence substantiates
the general beliefs. On the whole, the study's findings
positively support the general beliefs presented above, although a
few conclusions exist which are directly inconsistent with the
suppositions or difficult to allay individually.

Support for the beliefs was relatively stronger on process
aspects than on product aspects, and in location comparisons than
in dispersion comparisons. Overwhelming support came in the
category of location comparisons on process aspects, in which the
research conclusions are distinguished by extremely low critical
levels and by near unanimity with the basic supposition. In the
category of dispersion comparisons on process aspects, only two
outcomes indicated any distinction among the groups: one aspect
supported the study's beliefs and one aspect showed an explainable
exception to them. Fairly strong support also came in the
category of location comparisons on product aspects, for which the
only negative evidence (besides the neutral AI = AT = DT
conclusions) appeared in the form of several AI ≠ AT = DT
conclusions. They indicate some areas in which the disciplined
methodology was apparently ineffective in modifying a team's
behavior toward that of an individual, probably due to a lack of
fully developed training/experience with the methodology.
Comparatively weaker support for the study's beliefs was recorded
in the category of dispersion comparisons on product aspects.
Although the suppositions were borne out in a number of the
conclusions, there were also several distinctions of various forms
which contravene the suppositions.

Thus, according to this interpretation, the study's findings
strongly substantiate the claims that

(a)  methodological discipline is the key influence on the
     general efficiency of the software development process,
     and that

(b)  the disciplined methodology significantly reduces the
     material costs of software development.

The claims that

(a)  mental cohesiveness is the key influence on the general
     quality of the software development product, that

(b)  an ad hoc team is mentally burdened by organizational
     overhead compared to an individual, and that

(c)  the disciplined methodology offsets the mental burden of
     organizational overhead and enables a team to behave
     more like an individual relative to the product itself,

are moderately substantiated by the study's findings, with
particularly mixed evidence for dispersion comparisons on product
aspects.

It should be noted that there is a simpler, better-supported
interpretive model for the location results alone. With the
beliefs that a disciplined methodology provides for the minimum
process cost and results in a product which in some aspects
approximates the product of an individual and at worst
approximates the product developed by an ad hoc team, the
suppositions are DT < AI, AT with respect to process and
AI < DT < AT or AT < DT < AI with respect to product. The study's
findings support these suppositions without exception.

According to Programming Aspect Classification:

Before presenting the interpretations according to a
classification of the programming aspects, an explanation is in
order regarding this classification and its motivation. It is
desirable to make general interpretations in view of the way
certain general programming issues are reflected among the
individual programming aspects. For this purpose, the aspects
considered in this study were grouped into (so-called) programming
aspect classes. Each class consists of aspects which are related
by some common feature (for example, all aspects relating to the
program's statements, statement types, statement nesting, etc.),
and the classes are not necessarily disjoint (i.e., a given aspect
may be included in two or more classes). A unique higher-level
programming issue (in the example, control structure organization)
is associated with each class.

The programming aspects of this study were organized into a
hierarchy of nine aspect classes (with about 10% overlap overall),
outlined as follows:


Higher-level Programming Issue                     Class

Development Process Efficiency
    Effort (Job Steps) ........................... I
    Errors (Program Changes) ..................... II
Final Product Quality
    Gross Size ................................... III
    Control-Construct Structure .................. IV
    Data Variable Organization ................... V
    Modularity
        Packaging Structure ...................... VI
        Invocation Organization .................. VII
        Inter-Segment Communication
            Via Parameters ....................... VIII
            Via Global Variables ................. IX


The  individual  aspects  comprising  each  class,  together  with  the 
corresponding  conclusions,  are  listed  by  classes  in  Tables  5.1 
Table 5.1  Conclusions for Class I, Effort (Job Steps)


| programming aspect                     | location               | dispersion             |
|                                        | comparison   : critical| comparison   : critical|
|                                        | outcome      : level   | outcome      : level   |
|                                        |                        |                        |
| COMPUTER JOB STEPS                     | DT < AI = AT : 0.0036  | =  =         :         |
|   MODULE COMPILATIONS                  | DT < AI = AT : 0.0223  | =  =         :         |
|     UNIQUE                             | DT < AI = AT : 0.0110  | =  =         :         |
|     IDENTICAL                          | =  =         :         | =  =         :         |
|   PROGRAM EXECUTIONS                   | DT < AI = AT : 0.0221  | =  =         :         |
|   MISCELLANEOUS                        | DT < AI = AT : 0.1445  | AT = DT < AI : 0.0775  |
| ESSENTIAL JOB STEPS                    | DT < AI = AT : 0.0037  | =  =         :         |
| AVERAGE UNIQUE COMPILATIONS PER MODULE | DT < AI = AT : 0.0883  | =  =         :         |
| MAX UNIQUE COMPILATIONS F.A.O. MODULE  | DT < AI = AT : 0.1180  | DT < AI < AT :† .0514  |

alternative conclusions (from Table 4) showing relaxed differentiation:
(correspondence indicated via the † symbol)

|                                        |                        | DT < AI = AT :† .0036  |
|                                        |                        | DT = AI < AT :† .0511  |



Table 5.2  Conclusions for Class II, Errors (Program Changes)


| programming aspect | location               | dispersion             |
|                    | comparison   : critical| comparison   : critical|
|                    | outcome      : level   | outcome      : level   |
|                    |                        |                        |
| PROGRAM CHANGES    | DT < AI < AT :† .1848  | =  =         :         |

alternative conclusions (from Table 4) showing relaxed differentiation:
(correspondence indicated via the † symbol)

|                    | DT < AI = AT :† .0037  |                        |
|                    | DT = AI < AT :† .1846  |                        |
Table 5.3  Conclusions for Class III, Gross Size


programming  aspect 


location 

comparison  :critical 
outcome  > level 


dispersion 

comparison  :critical 
outcome  : level 


I MODULES 

I AVERAGE  SEGMENTS  PER  MODULE 
I AVERAGE  GLOBAL  VARIABLES  PER  MODULE 


I SEGMENTS 

I AVERAGE  STATEMENTS  PER  SEGMENT 
I AVERAGE  NONGLOBAL  VARIABLES  PER  SEGMENT 
I PARAMETER 
I LOCAL 


I DATA  VARIABLES 

I DATA  VARIABLE  SCOPE  COUNTS  \ GLOBAL 
I DATA  VARIABLE  SCOPE  COUNTS  \ NONGLOBAL 
I PARAMETER 
I LOCAL 


I LINES 
I STATEMENTS 

I AVERAGE  TOKENS  PER  STATEMENT 


- AI  < AT  : 0.0218 


AI  < AT  - DT 
AT  » DT  < AI 

m m 

AI  < AT  • DT 


AI  < AT  - DT 
AI  < AT  - DT 

m m 

AI  < AT  - DT 


AI  < DT  < AT 


0.0634 

0.1706 

0.1748 


0. 0698 
0.1476 

0.1271 


- AT  < DT  : 0.1241 
• AT  < DT  0.1061 


< DT  • AI 
- AT  < DT 


: 0.1954 
: 0.1061 


I TOKENS 


alternative conclusions (from Table 4) showing relaxed differentiation:
(correspondence indicated via the † symbol)


| DT = AI < AT :† .0617 |
| AI < AT = DT :† .1132 |
Table 5.4  Conclusions for Class IV, Control-Construct Structure


| programming aspect       | location               | dispersion             |
|                          | comparison   : critical| comparison   : critical|
|                          | outcome      : level   | outcome      : level   |
|                          |                        |                        |
| STATEMENTS               | =  =         :         | AT < DT = AI : 0.1954  |
| STATEMENT TYPE COUNTS    |              :         |              :         |
|   :=                     | =  =         :         | =  =         :         |
|   IF                     | DT = AI < AT : 0.0780  | =  =         :         |
|   CASE                   | =  =         :         | =  =         :         |
|   WHILE                  | =  =         :         | =  =         :         |
|   EXIT                   | =  =         :         | =  =         :         |
|   (PROC)CALL             | =  =         :         | DT < AI = AT : 0.0325  |
|     NONINTRINSIC         | =  =         :         | DT < AI = AT : 0.1862  |
|     INTRINSIC            | DT = AI < AT : 0.1732  | =  =         :         |
|   RETURN                 | DT = AI < AT : 0.0860  | DT < AI < AT :& .1398  |

alternative conclusions (from Table 4) showing relaxed differentiation:
(correspondence indicated via the & symbol)

Table 5.6  Conclusions for Class VI, Packaging Structure

Table 5.8  Conclusions for Class VIII, Communication via Parameters

Table 5.9  Conclusions for Class IX, Communication via Global Variables

alternative conclusions (from Table 4) showing relaxed differentiation:
(correspondence indicated via the t, 9, 4, and $ symbols)

through 5.9. For each aspect class, it is interesting to jointly
interpret the individual outcomes in an overall manner in order to
see something of how these higher-level issues are affected by the
factors of team size and methodological discipline.

Class I:

Within Class I (process aspects dealing with COMPUTER JOB
STEPS), there is strong evidence of an important difference among
the groups, in favor of the disciplined methodology, with respect
to average development costs. As a class, these aspects directly
reflect the frequency of computer system operations (i.e., module
compilations and test program executions) during development.
They are one possible way of measuring machine costs, in units of
basic operations rather than monetary charges. Assuming each
computer system operation involves a certain expenditure of the
programmer's time and effort (e.g., effective terminal contact,
test result evaluation), these aspects indirectly reflect human
costs of development (at least that portion not devoted to design
work).

The strength of the evidence supporting a difference with
respect to location comparisons within this class is based on both
(a) the near unanimity [8 out of 9 aspects] of the DT < AI = AT
outcome and (b) the very low critical levels [<.025 for 5 aspects]
involved. Indeed, the single exception among the location
comparisons (AI = AT = DT on COMPUTER JOB STEPS\MODULE
COMPILATIONS\IDENTICAL) is readily explained as a direct
consequence of the fact that all teams made essentially similar
usage (or nonuse, in this case, since identical compilations were
not uncommon) of the on-line storage capability (for saving
relocatable modules and thus avoiding identical recompilations).
This was expected since all teams had been provided with identical
storage capability, but without any training or urging to use it.
The conclusions on location comparisons within this class are
interpreted as demonstrating that

    employment of the disciplined methodology by a
    programming team reduces the average costs, both machine
    and human, of software development, relative to both
    individual programmers and programming teams not
    employing the methodology.

Examination of the raw data scores themselves indicates the
magnitude of this reduction to be on the order of 2 to 1 (i.e.,
50%) or better.
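
The two counts cited for the location comparisons in this class can be checked with a short tally. The sketch below is offered only as a reader's cross-check: the outcomes and critical levels are transcribed from Table 5.1 as it appears in this copy (and so may carry transcription errors), and the variable names are assumptions introduced here.

```python
# (aspect, location outcome, critical level) as read from Table 5.1;
# None marks the null outcome, for which no critical level is reported.
LOCATION = [
    ("COMPUTER JOB STEPS", "DT < AI = AT", 0.0036),
    ("MODULE COMPILATIONS", "DT < AI = AT", 0.0223),
    ("MODULE COMPILATIONS\\UNIQUE", "DT < AI = AT", 0.0110),
    ("MODULE COMPILATIONS\\IDENTICAL", "AI = AT = DT", None),
    ("PROGRAM EXECUTIONS", "DT < AI = AT", 0.0221),
    ("MISCELLANEOUS", "DT < AI = AT", 0.1445),
    ("ESSENTIAL JOB STEPS", "DT < AI = AT", 0.0037),
    ("AVERAGE UNIQUE COMPILATIONS PER MODULE", "DT < AI = AT", 0.0883),
    ("MAX UNIQUE COMPILATIONS F.A.O. MODULE", "DT < AI = AT", 0.1180),
]

# Aspects whose location outcome matches the basic supposition.
supporting = [row for row in LOCATION if row[1] == "DT < AI = AT"]

# Of those, the aspects with a very low critical level.
very_low = [row for row in supporting if row[2] < 0.025]

print(f"{len(supporting)} of {len(LOCATION)} aspects show DT < AI = AT")
print(f"{len(very_low)} aspects have a critical level below .025")
```

With the values as transcribed, the tally agrees with the figures quoted in the text: 8 of the 9 aspects show the DT < AI = AT outcome, and 5 of them do so at a critical level below .025.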

With respect to dispersion comparisons within this class, the
evidence generally failed to make any distinctions among the
groups [AI = AT = DT on 7 out of 9 aspects]. These null
conclusions in dispersion comparisons are interpreted as
demonstrating that

    variability of software development costs, especially
    machine costs, is relatively insensitive to the factors
    of programming team size and degree of methodological
    discipline.

The two exceptions on individual process aspects both deserve
mention. The COMPUTER JOB STEPS\MISCELLANEOUS aspect showed an
AT = DT < AI dispersion distinction among the groups, reflecting
the wider-spread behavior (as expected) of individual programmers
relative to programming teams in the area of building on-line
tools to indirectly support software development (e.g.,
stand-alone module drivers, one-shot auxiliary computations, table
generators, unanticipated debugging stubs, etc.). The MAX UNIQUE
COMPILATIONS F.A.O. MODULE aspect showed a DT < AI = AT dispersion
distinction among the groups at an extremely low critical level
[<.005], reflecting the lower variation (increased predictability)
of the disciplined teams relative to the ad hoc teams and
individuals in terms of "worst case" compilation costs for any one
module. The additional AI < AT distinction for this comparison is
clearly attributable to the fact that several teams in group AT
built monolithic single-module systems, yielding rather inflated
raw scores for this aspect.
Class II:

Within Class II (the process aspect PROGRAM CHANGES), there
is strong evidence of an important difference among the groups,
again in favor of the disciplined methodology, with respect to
average number of errors encountered during implementation.
Appendix 1 contains a detailed explanation of how program changes
are counted. This aspect directly reflects the amount of textual
revision to the source code during (postdesign) development.
Claiming that textual revisions are generally necessitated by
errors encountered while building, testing, and debugging
software, recent research [Dunsmore and Gannon 77] has confirmed a
high (rank order) correlation of total program changes (as counted
automatically according to a specific algorithm) with total error
occurrences (as tabulated manually from exhaustive scrutiny of
source code and test results) during software implementation.
This aspect is thus a reasonable measure of the relative number of
programming errors encountered outside of design work. Assuming
each textual revision involves a certain expenditure of the
programmer's effort (e.g., planning the revision, on-line editing
of source code), this aspect indirectly reflects the level of
human effort devoted to implementation.

With respect to location comparison, the strength of the
evidence supporting a difference among the groups is based on the
very low critical level [<.005] for the DT < AI = AT outcome. The
additional trend toward AI < AT is much less pronounced in the
data. The interpretation is that

the  disciplined  methodology  effectively  reduced  the 
average  number  of  errors  encountered  during  software 
implementation. 

This was expected since the methodology purposely emphasizes the
criticality of the design phase and subjects the software design
(code) to thorough reading and review prior to coding (key-in or
testing), enhancing error detection and correction prior to
implementation (testing).

With respect to dispersion comparison, no distinction among
the groups was apparent, with the interpretation that

variability in the number of errors encountered during
implementation was essentially uniform across all three
programming environments considered.


Class  III: 

Within Class III (product aspects dealing with the gross size
of the software at various hierarchical levels), there is evidence
of certain consistent differences among the groups with respect to
both average size and variability of size. As a class, these
aspects directly reflect the number of objects and the average
number of component (sub)objects per object, according to the
hierarchical organization (imposed by the programming language) of
the software itself into objects such as modules, segments, data
variables, lines, statements, and tokens.

With respect to location comparisons within this class, the
non-null conclusions [7 out of 17 aspects] are nearly unanimous [5
out of 7] in the AI < AT = DT outcome. The interpretation is that
individuals tend to produce software which is smaller (in certain
ways) on the average than that produced by teams. It is unclear
whether such spareness of expression, primarily in segments,
global variables, and formal parameters, is advantageous or not.
The two non-null exceptions to this AI < AT = DT trend deserve
mention, since the one is only nominally exceptional and actually
supportive of the tendency upon closer inspection, while the other
indicates a size aspect in which the disciplined methodology
enabled programming teams to break out of the pattern of
distinction from individual programmers. The AT = DT < AI outcome
on AVERAGE STATEMENTS PER SEGMENT is a simple consequence of the
outcome for the number of STATEMENTS (AI = AT = DT) and the
outcome for the number of SEGMENTS (AI < AT = DT), and it still
fits the overall pattern of AI ≠ AT = DT on location differences
on size aspects. On the LINES aspect, the DT = AI < AT
distinction breaks the pattern since DT is associated with AI and
not with AT. Since the number of statements was roughly the same
for all three groups, this difference must be due mainly to the
stylistic manner of arranging the source code (which was
free-format with respect to line boundaries), to the amount of
documentation comments within the source code, and to the number
of lines taken up in data variable declarations.


With respect to dispersion comparisons within this class, the
few aspects which do indicate any distinction among the groups [5
out of 17 aspects] seem to concur on the AI = AT < DT outcome.
This pattern, which associates increased variation in certain size
aspects with the disciplined methodology, is somewhat surprising
and lacks an intuitive explanation in terms of the experimental
factors. The exception DT = AI < AT on AVERAGE SEGMENTS PER
MODULE is really an exaggeration due to the fact that several AT
teams implemented monolithic single-module systems, as mentioned
above. The exception AT < DT = AI on STATEMENTS is only a very
slight trend, reflecting the fact that the AT products rather
consistently contained the largest numbers of statements.


One overall observation for Class III is that while certain
distinctions did consistently appear (especially for location but
also for dispersion comparisons) at the middle levels of the
hierarchical scale [segments, data variables, lines, and
statements], no distinctions appeared at either the highest
[modules] or lowest [tokens] levels of size. The null conclusions
for size in modules and average module size seem attributable to
the fact that particular programming tasks or application domains
often have certain standard approaches at the topmost conceptual
levels which strongly influence the organization of software
systems at this highest level of gross size. In this case, the
two-pass symbol-table/scanning/parsing/code-generation approach is
extremely common for language translation problems (i.e.,
compilers), regardless of the particular parsing technique or
symbol table organization employed, and the modules of nearly
every system in the study directly reflected this common approach.
The null conclusions for size in tokens are interpretable in view
of Halstead's software science concepts [Halstead 77], according
to which the program length N is predictable from the number of
basic input-output parameters and the language level λ. Since the
functional specification, the application area, and the
implementation language were all fixed in the study, both that
parameter count and λ are essentially constant for each of the
software systems, implying essentially constant lengths N as
measured in terms of operators and operands. Considering the
number of tokens as roughly equivalent to program length N, the
study's data seem to support the software science concepts in this
instance.

Class  IV: 

Within Class IV (product aspects dealing with the software's
organization according to statements, constructs, and control
structures), there are only a few distinctions made between the
groups.

With respect to location comparisons, the few [5 out of 24]
aspects that showed any distinction at all were unanimous in
concluding DT = AI < AT. Essentially, three particular issues
were involved. The STATEMENT TYPE COUNTS\IF, STATEMENT TYPE
PERCENTAGES\IF, and DECISIONS aspects are all related to the
frequency of programmer-coded decisions in the software product.
Their common outcome DT = AI < AT is interpreted as demonstrating
an important area in which the disciplined methodology causes a
programming team to behave like an individual programmer. The
number of decisions has been commonly accepted, and even
formalized [McCabe 76], as a measure of program complexity since
more decisions create more paths through the code. Thus, the
disciplined methodology effectively reduced the average complexity
from what it otherwise would have been. The STATEMENT TYPE
COUNTS\RETURN aspect indicates a difference between the ad hoc
teams and the other two groups. Since the EXIT and RETURN
statements are restricted forms of GOTOs, this difference seems to
hint at another area in which the disciplined methodology improves
conceptual control over program structure. The STATEMENT TYPE
COUNTS\(PROC)CALL\INTRINSIC aspect also indicates a slight trend in
the area of the frequency of input-output operations, which seems
interpretable only as a result of stylistic differences.
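
As a reminder of why a decision count can stand in for complexity, the cyclomatic number of [McCabe 76] may be written as follows. The symbols e, n, p, and \pi are the usual graph-theoretic ones and are not taken from this report; whether the DECISIONS aspect counts exactly the predicates \pi is an assumption here.

```latex
% Cyclomatic number of a control-flow graph G with
% e edges, n nodes, and p connected components:
v(G) = e - n + 2p
% For a single structured segment containing \pi binary
% decision predicates (IF, WHILE, ...), this reduces to:
v(G) = \pi + 1
```

Under this reading, each additional programmer-coded decision adds one independent path through the code, so a group with fewer decisions per product has lower average complexity in this sense.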

With respect to dispersion comparisons, only two particular
issues were involved. The STATEMENT TYPE COUNTS\RETURN and
STATEMENT TYPE PERCENTAGES\RETURN aspects both indicated a strong
DT = AI < AT difference, suggesting that the frequency of these
restricted GOTOs is an area in which the disciplined methodology
reduces variability, causing a programming team to behave more
like an individual programmer. The STATEMENT TYPE
COUNTS\(PROC)CALL and STATEMENT TYPE COUNTS\(PROC)CALL\NONINTRINSIC
aspects both showed a DT < AI = AT distinction among the groups,
which is dealt with more appropriately within Class VII below.

In summary of Class IV, the interpretation is that the
functional component of control-construct organization is largely
unaffected by the team size and methodological discipline factors,
probably due to the overriding effect of project/task
uniformity/commonality. However, two facets of the control
component that were influenced were the frequency of decisions
(especially IF statements) and the frequency of restricted GOTOs
(especially RETURN statements). For these aspects, the
disciplined methodology altered the control structure (and reduced
the complexity) of a team's product to that of an individual's
product.

Class V:

Within Class V (product aspects dealing with data variables
and their organization within the software), there are several
distinctions among the groups, with an overall trend for both the
location and dispersion comparisons. Data variable organization
was, however, not emphasized in the disciplined methodology, nor
in the academic course which the participants in group DT were
taking. With respect to location comparisons, all aspects showing
any distinction at all were unanimous in concluding AI ≠ AT = DT.
The trend for individuals to differ from teams, regardless of the
disciplined methodology, appears not only for the total number of
data variables declared, but also for data variables at each scope
level (global, parameter, local) in one fashion or another. The
difference regarding formal parameters is especially prominent,
since it shows up for their raw count frequency, their normalized
percentage frequency, and their average frequency per natural
enclosure (segment). With respect to dispersion comparisons, the
apparent overall trend for aspects which show a distinction is
toward the AI = AT < DT outcome. No particular interpretation in
view of the experimental factors seems appropriate. Exceptions to
this trend appeared for both the raw count and percentage of
call-by-reference parameters (both AI < AT = DT), as well as two
other aspects.

Class  VI: 

Within Class VI (product aspects dealing with modularity in
terms of the packaging structure), there are essentially no
distinctions among the groups, except for two location comparison
issues. Most of the aspects in this class are also members of
Class III, Gross Size, but are (re)considered here to focus
attention upon the packaging characteristics of modularity (i.e.,
how the source code is divided into modules and segments, what
type of segments, etc.). The disciplined methodology did not
explicitly include (nor did group DT's course work cover) concepts
of modularization or criteria for evaluating good modularity;
hence, no particular distinctions among the groups were expected
in this area (Classes VI and VII).

With respect to location comparisons, the AI < AT = DT
outcome for the SEGMENTS aspects, along with the companion outcome
AT = DT < AI for the AVERAGE STATEMENTS PER SEGMENT aspect (as
explained under Class III above), indicates one area of packaging
that is apparently sensitive to the team size factor. Individual
programmers built the system with fewer, but larger (on the
average), segments than either the ad hoc teams or the disciplined
teams. The AI < AT = DT outcome for the AVERAGE NONGLOBAL
VARIABLES PER SEGMENT\PARAMETER aspect indicates that average
"calling sequence" length, curiously enough, is another area of
packaging sensitive to team size. With respect to dispersion
comparisons, there really were no differences, since the single
non-null outcome for AVERAGE SEGMENTS PER MODULE is actually a
fluke (raw scores for AT are exaggerated by the several monolithic
systems) as explained above. The overall interpretation for this
class is that

modularity, in the sense of packaging code into segments
and modules, is essentially unaffected by team size or
methodological discipline, except for a tendency by
individual programmers toward fewer, longer segments
than programming teams.


Class VII:

Within Class VII (product aspects dealing with modularity in
terms of the invocation structure), there are two distinction
trends for location comparisons, but no clear pattern for the
dispersion comparison conclusions. This class consists of raw
counts and average-per-segment frequencies for invocations
(procedure CALL statements or function references in expressions)
and is considered separately from the previous class since
modularity involves not only the manner in which the system is
packaged, but also the frequency with which the pieces are
invoked. For the raw count frequencies of calls to intrinsic
procedures and intrinsic routines, the trend is for the
individuals and disciplined teams to exhibit fewer calls than the
ad hoc teams. These intrinsic procedures are almost exclusively
the input-output operations of the language, while the intrinsic
functions are mainly data type conversion routines. The second
trend for location comparisons occurs for two aspects (a third
aspect is actually redundant) related to the average frequency of
calls to programmer-defined routines, in which the individuals
display higher average frequency than either type of team. This
seems coupled with group AI's preference for fewer but larger
routines, as noted above. With respect to dispersion comparisons,
several distinctions appear within this class, but no overall
interpretation is readily apparent (except for a consistent
reflection of a DT < AI difference, with AT falling in between,
leaning one side or the other).

Class  VIII: 

Within  Class  VIII  (product  aspects  dealing  with  inter-segment 
communication  via  formal  parameters),  there  are  only  a few 
distinctions  among  the  groups.  With  respect  to  location 
comparisons,  the  total  frequency  of  parameters  and  the  average 
frequency  of  parameters  per  segment  both  show  a difference.  The 
interpretation  is  that 

the  individual  programmers  tend  to  incorporate  less 
inter-segment  communication  via  parameters,  on  the 
average,  than  either  the  ad  hoc  or  the  disciplined 
programming  teams. 

With respect to dispersion comparisons, in addition to the
difference in the raw count of parameters referred to in Class V,
there is a strong difference in the variability of the number of
call-by-reference parameters, also apparent in the
percentages-by-type-of-parameter aspects. The interpretation is
that

the individual programmers were more consistent as a
group in their use (in this case, avoidance) of
reference parameters than either type of programming
team.


Class  IX: 

Within Class IX (product aspects dealing with inter-segment
communication via global variables), there are several differences
among the groups, including two which indicate the beneficial
influence of the disciplined methodology. This class is composed
of aspects dealing with (a) frequency of globals, (b) average
frequency of globals per module, (c) segment-global usage pairs
(frequency of access paths from segments to globals), and (d)
segment-global-segment data bindings [Stevens, Myers, and
Constantine 74; pp. 11i-119] (frequency of logical bindings
between two different segments via a global variable which is
modified by the first segment and referenced by the second).
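
The usage-pair and data-binding counts described above can be illustrated with a minimal sketch. It assumes each segment's sets of modified and referenced globals are already known, and that every global is visible to every segment when counting possible access paths; the function name `global_usage_metrics` and the input layout are hypothetical, not the tooling used in the study.

```python
def global_usage_metrics(segments, global_names):
    """Count segment-global usage pairs and data bindings (illustrative).

    segments: {segment_name: {"modifies": set, "references": set}}
    global_names: set of global variable names
    Assumption: every global is visible to every segment.
    """
    # Actual access paths: a segment either modifies or references a global.
    actual_pairs = set()
    for seg, use in segments.items():
        for g in (use["modifies"] | use["references"]) & global_names:
            actual_pairs.add((seg, g))

    # Possible access paths under the visibility assumption above.
    possible = len(segments) * len(global_names)
    relative_pct = 100.0 * len(actual_pairs) / possible if possible else 0.0

    # Data binding (p, g, q): p modifies g, a different segment q references g.
    bindings = set()
    for p, p_use in segments.items():
        for q, q_use in segments.items():
            if p == q:
                continue
            for g in p_use["modifies"] & q_use["references"] & global_names:
                bindings.add((p, g, q))

    return {
        "actual_pairs": len(actual_pairs),
        "possible_pairs": possible,
        "relative_percentage": relative_pct,
        "data_bindings": len(bindings),
    }


# Hypothetical three-segment system with two globals.
example = {
    "A": {"modifies": {"g1"}, "references": set()},
    "B": {"modifies": {"g2"}, "references": {"g1", "g2"}},
    "C": {"modifies": set(), "references": {"g2"}},
}
metrics = global_usage_metrics(example, {"g1", "g2"})
```

Here the bindings counted are those actually realized in the code; the report also speaks of possible bindings, whose exact definition is not restated in this section, so this sketch does not attempt to reproduce it.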

With respect to location comparisons, there is the
AI < AT = DT distinction in sheer numbers of globals, particularly
globals which are modified during execution, as noted in Class V.
However, when averaged per module, there appears to be no
distinction in the frequency of globals. The AI < AT = DT
difference in the number of possible segment-global access paths
makes sense as the result of group AI having both fewer segments
and fewer globals. All three groups had essentially similar
average levels of actual segment-global access paths, but several
differences appear in the relative percentage (actual-to-possible
ratio) category. These three instances of AT < DT = AI
differences indicate that the degree of "globality" for global
variables was higher for the individuals and the disciplined teams
than for the ad hoc teams. Finally, another AT ≠ DT = AI
difference appears for the frequency of possible
segment-global-segment data bindings, indicating a positive effect
of the disciplined methodology in reducing the possible data
coupling among segments. It may be noted that these last two
categories of aspects, segment-global usage relative percentages
and segment-global-segment data bindings, also reflect upon the
quality of modularization, since good modularity should promote
the degree of "globality" for globals and minimize the data
coupling among segments. The interpretation here is that

certain aspects of inter-segment communication via
globals seem to be positively influenced, on the
average, by the disciplined methodology.

With respect to dispersion comparisons, there is a diversity
of differences in this class, without any unifying interpretation
in terms of the experimental factors.

VI. Concluding Remarks

A practical methodology was designed and developed for
experimentally and quantitatively investigating the software
development phenomenon. It was employed to compare three
particular software development environments and to evaluate the
relative impact of a particular disciplined methodology (made up
of so-called modern programming practices). The experiments were
successful in measuring differences among programming environments
and the results support the claim that disciplined methodology
effectively improves both the process and product of software
development.

One way to substantiate the claim for improved process is to
measure the effectiveness of the particular programming
methodology via the number of bugs initially in the system (i.e.,
in the initial source code) and the amount of effort required to
remove them. (This criterion was independently suggested by
Professor W. Shooman of Polytechnic Institute of New York while
speaking recently on the subject of software reliability models.)
Although neither of these measures was directly computed, they are
each closely associated with one of the process aspects considered
in the study: PROGRAM CHANGES and ESSENTIAL JOB STEPS,
respectively. The statistical conclusions (on location
comparison) for both these aspects affirmed DT < AI = AT outcomes
at very low (<.01) significance levels, indicating that on the
average the disciplined teams measured lower than either the ad
hoc individuals or the ad hoc teams, which both measured about the
same. Thus, the evidence collected in this study strongly
confirms the effectiveness of the disciplined methodology in
building reliable software efficiently.

The second claim, that the product of a disciplined team
should closely resemble that of a single individual since the
disciplined methodology assures a semblance of conceptual
integrity within a programming team, was partially substantiated.
In many product aspects the products developed using the
disciplined methodology were either similar to or tended toward
the products developed by the individuals. In no case did any of
the measures show the disciplined teams' products to be worse than
those developed by the ad hoc teams. It is felt that the
superficiality of most of the product measures was chiefly
responsible for the lack of stronger support for this second
claim. The need for product measures with increased sensitivity
to critical characteristics of software is very clear.


The results of these experiments will be used to guide
further experiments and will act as a basis for analysis of
software development products and processes in the Software
Engineering Laboratory at NASA/GSFC [Basili et al. 77]. The
intention is to pursue this type of research, especially extending
the study to include more sophisticated and promising programming
aspects, such as Halstead's software science quantities [Halstead
77] and other software complexity metrics [McCabe 76].

Appendix 1. Explanatory Notes for the Programming Aspects


The following numbered paragraphs, keyed to the list of
aspects in Table 1, explain in detail the programming aspects
considered in the study. Various system- or language-dependent
terms (e.g., module, segment, intrinsic, entry) are also defined
here.

(1) A computer job step is a single activity performed on a
computer at the operating system command level which is inherent
to the development effort and involves a nontrivial expenditure of
computer or human resources. Typical job steps might include text
editing, module compilation, program collection or link-editing,
and program execution; however, operations such as querying the
operating system for status information or requesting access to
on-line files would not be considered as job steps. In this
study, only module compilations and program executions are counted
as COMPUTER JOB STEPS.

(2) A module compilation is an invocation of the
implementation language processor on the source code of an
individual module. In this study, only compilations of modules
comprising the final software product (or logical predecessors
thereof) are counted as COMPUTER JOB STEPS\MODULE COMPILATIONS.

(3) All MODULE COMPILATIONS are classified as either
IDENTICAL or UNIQUE depending on whether or not the source code
compiled is textually identical to that of a previous compilation.
During the development process, each unique compilation was
necessary in some sense, while an identical compilation could have
been logically avoided by saving the relocatable output of a
previous compilation for later reuse (except in the situation of
undoing source code revisions after they have been tested and
found to be erroneous or superfluous).

(4) A program execution is an invocation of a complete
programmer-developed program (after the necessary compilation(s)
and collection or link-editing) upon some test data.

(5) A miscellaneous job step is an auxiliary compilation or
execution of something other than the final software product.
Only job steps counted as COMPUTER JOB STEPS, but not counted as
COMPUTER JOB STEPS\MODULE COMPILATIONS or COMPUTER JOB STEPS\
PROGRAM EXECUTIONS, are counted as COMPUTER JOB STEPS\
MISCELLANEOUS.


(6) An essential job step is a computer job step which
involves the final software product (or logical predecessors
thereof) and could not have been avoided (by off-line computation
or by on-line storage of previous compilations or results). In
this study, the number of ESSENTIAL JOB STEPS is the sum of the
number of COMPUTER JOB STEPS\MODULE COMPILATIONS\UNIQUE plus the
number of COMPUTER JOB STEPS\PROGRAM EXECUTIONS.

(7) The number of AVERAGE UNIQUE COMPILATIONS PER MODULE is
simply the number of COMPUTER JOB STEPS\MODULE COMPILATIONS\UNIQUE
divided by the number of MODULES.


(8) The number of MAX UNIQUE COMPILATIONS F.A.O. MODULE is
simply the maximum number of unique compilations for any one
module of the final software product. F.A.O. stands for "for any
one". Each unique compilation is associated (either directly or
as a logical predecessor) with a particular module of the final
product; their sum is computed for each module; and the maximum of
the sums is taken.
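
The job-step aspects defined in notes (1) through (8) can be tallied as in the following minimal sketch. It assumes a chronological log of job steps in which each compilation has already been attributed to a module of the final product (logical predecessors resolved), and it compares source text per module to decide IDENTICAL versus UNIQUE; the function name `job_step_metrics` and the log layout are hypothetical.

```python
from collections import defaultdict


def job_step_metrics(steps):
    """Tally job-step aspects from a chronological log (illustrative).

    steps: list of (kind, module, source_text) tuples, where kind is
    "compile", "execute", or "misc"; module and source_text are only
    meaningful for "compile" steps.
    """
    seen_sources = defaultdict(set)      # module -> source texts compiled so far
    unique_per_module = defaultdict(int)
    identical = executions = misc = 0

    for kind, module, source in steps:
        if kind == "compile":
            if source in seen_sources[module]:
                identical += 1           # textually identical to an earlier compile
            else:
                seen_sources[module].add(source)
                unique_per_module[module] += 1
        elif kind == "execute":
            executions += 1
        else:
            misc += 1

    unique = sum(unique_per_module.values())
    modules = len(unique_per_module)
    return {
        "computer_job_steps": unique + identical + executions + misc,
        "unique_compilations": unique,
        "identical_compilations": identical,
        "program_executions": executions,
        "miscellaneous": misc,
        "essential_job_steps": unique + executions,
        "avg_unique_per_module": unique / modules if modules else 0.0,
        "max_unique_fao_module": max(unique_per_module.values(), default=0),
    }


# Hypothetical log: two modules, one repeated compilation.
log = [
    ("compile", "scanner", "v1"),
    ("compile", "scanner", "v1"),
    ("compile", "scanner", "v2"),
    ("compile", "parser", "p1"),
    ("execute", None, None),
    ("misc", None, None),
]
result = job_step_metrics(log)
```

The sketch takes the number of MODULES to be the number of distinct modules that were compiled at least once, which holds only if every final module appears in the log.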

(9) The program changes metric [Dunsmore and Gannon 77] is
defined in terms of textual revisions in the source code of a
module during the development period, from the time that module is
first presented to the computer system, to the completion of the
project. The rules for counting program changes --which are
reproduced below from the paper referenced above with the kind
permission of the authors-- are such that one program change
should represent approximately one conceptual change to the
program.

The  following  each  represent  a single  program  change: 

(a)  one  or  more  changes  to  a single  statement, 

(A single statement in a program represents a single
concept and even multiple character changes to that
statement represent mental activity with a single
concept.)

(b)  one  or  more  statements  inserted  between  existing 
statements, 

(The  contiguous  group  of  statements  inserted  probably 
corresponds  to  a single  abstract  instruction.) 

(c)  a change  to  a single  statement  followed  by  the  insertion 
of  new  statements. 

(This  instance  probably  represents  a discovery  that  an 
existing  statement  is  insufficient  and  that  it  must  be 
altered  and  supplemented  in  order  to  achieve  the  single 
concept  for  which  it  was  produced.) 

However, the following are not counted as program changes:

(a) the deletion of one or more existing statements,

(Statements which are deleted must usually be replaced
with other statements elsewhere. The inserted
statements are counted; counting deletions as well would
give double weight to such a change. Occasionally
statements are deleted but not replaced; these are
probably being used for debugging purposes and their
deletion takes no great mental activity.)

(b) the insertion of standard output statements or special
compiler-provided debugging directives,

(These are occasionally inserted in a wholesale fashion
during debugging. When the problem is discerned, these
are then all removed, and the actual statement change
takes place.)

(c)  the  insertion  of  blank  lines,  insertion  of  comments, 
revision  of  comments,  and  reformatting  without  alteration  of 
existing  statements. 

(These  are  all  judged  to  be  cosmetic  in  nature.) 

Program changes are counted automatically according to a specific
algorithm which symbolically compares the source code from each
pair of consecutive compilations of a particular module (or
logical predecessor thereof). Thus the total number of program
changes is a measure of the amount of textual revision to source
code during (postdesign) system development.
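
The counting rules above can be approximated with a short sketch. This is not the algorithm of [Dunsmore and Gannon 77], whose details are not given here; it is a simplified stand-in that assumes one statement per source line, uses Python's `difflib.SequenceMatcher` to align consecutive versions, and treats only blank lines as cosmetic unless the caller supplies a different `is_ignored` predicate (for comments or debugging output). All function and parameter names are hypothetical.

```python
from difflib import SequenceMatcher


def count_program_changes(old, new, is_ignored=lambda s: not s.strip()):
    """Approximate the program changes between two versions of a module.

    old, new: lists of source lines, assumed to hold one statement each.
    is_ignored: predicate for cosmetic lines (blank lines by default).
    """
    old_stmts = [s.strip() for s in old if not is_ignored(s)]
    new_stmts = [s.strip() for s in new if not is_ignored(s)]

    changes = 0
    matcher = SequenceMatcher(None, old_stmts, new_stmts, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "insert":
            # Rule (b): a contiguous group of inserted statements is one change.
            changes += 1
        elif tag == "replace":
            # Rules (a) and (c): each altered statement is one change; extra
            # inserted statements are absorbed, extra deletions are not counted.
            changes += min(i2 - i1, j2 - j1)
        # "delete" and "equal" blocks contribute nothing.
    return changes


def total_program_changes(versions):
    """Sum the changes over each pair of consecutive compiled versions."""
    return sum(count_program_changes(a, b) for a, b in zip(versions, versions[1:]))


# Hypothetical history of one module across three compilations.
v1 = ["a := 1", "b := 2", "c := 3"]
v2 = ["a := 1", "b := 5", "c := 3"]                       # one statement altered
v3 = ["a := 1", "b := 5", "x := 7", "y := 8", "c := 3"]   # one group inserted
history_total = total_program_changes([v1, v2, v3])
```

Because the alignment is computed on whole statements, a heavily rewritten region may be split into blocks differently than a symbolic comparison would split it, so counts from this sketch should be read as indicative only.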


(10) A module is a separately compiled portion of the
complete software system. In the implementation language SIMPL-T,
a typical module is a collection of the declarations of several
global variables and the definitions of several segments. [In
this study, only those modules which comprise the final product
are counted as MODULES.]

(11) A segment is a collection of source code statements,
together with declarations for the formal parameters and local
variables manipulated by those statements, which may be invoked as
an operational unit. In the implementation language SIMPL-T, a
segment is either a value-returning function (invoked via
reference in an expression) or else a non-value-returning
procedure (invoked via the CALL statement); recursive segments
are allowed and fully supported. The segment, function, and
procedure of SIMPL-T correspond to the (sub)program, function, and
subroutine of FORTRAN, respectively.

(12) The group of aspects named SEGMENT TYPE COUNTS, etc.,
gives the absolute number of programmer-defined segments of each
type. The group of aspects named SEGMENT TYPE PERCENTAGES, etc.,
gives the relative percentage of each type of segment, compared
with the total number of programmer-defined segments. The second
group of aspects is computed from the first by simply dividing by
the number of SEGMENTS, as a way of normalizing the segment type
counts.

(13)  Since  segment  definitions  in  the  implementation  language 
SIf'PL-T  occur  within  the  context  of  a module,  this  provias  a 
natural  way  to  normalize  (or  average)  the  raw  counts  of  segments. 
The  AVERAGE  SEGMENTS  PER  MODULE  aspect  represents  the  number  of 
segments  in  a typical  mooule.  It  is  computed  in  the  obvious  way. 

(14) The number of LINES is the total count of every textual
line in the source code of the complete final product, including
comments, compiler directives, variable declarations, executable
statements, etc.

(15) The number of STATEMENTS counts only the executable
constructs in the source code of the complete final product.
These are high-level, structured-programming statements, including
simple statements -- such as assignment and procedure call -- as
well as compound statements -- such as if-then-else and while-do --
which have other statements nested within them. The
implementation language SIMPL-T allows exactly seven different
statement types (referred to by their distinguishing keyword or
symbol) covering assignment (:=), alternation-selection (IF,
CASE), iteration (WHILE, EXIT), and procedure invocation (CALL,
RETURN). Input-output operations are accomplished via calls to
certain intrinsic procedures.

(16)  The  group  of  aspects  named  STATEMENT  TYPE  COUNTS,  etc., 
gives  the  absolute  number  of  executable  statements  of  each  type. 
The  group  of  aspects  named  STATEMENT  TYPE  PERCENTAGES,  etc.,  gives 
the  relative  percentage  of  each  type  of  statement,  compared  with 
the  total  number  of  executable  statements.  The  second  group  of 
aspects  is  computed  from  the  first  by  simply  dividing  by  the 
number  of  STATEMENTS,  as  a way  of  normalizing  the  statement  type
counts. 

(17)  As  mentioned  above,  the  :=  symbol  denotes  the  assignment
statement.  It  assigns  the  value  of  the  expression  on  the  right 
hand  side  to  the  variable  on  the  left  hand  side. 

(18)  Both  if-then  and  if-then-else  constructs  are  counted  as
IF  statements.  Each  IF  statement  allows  the  execution  of  either 
the  then-  or  else-part  statements,  depending  upon  its  Boolean 
expression. 

(19)  The  CASE  statement  provides  for  selection  from  several
alternatives,  depending  upon  the  value  of  an  expression.  In  the 
implementation  language  SIMPL-T,  exactly  one  of  the  alternatives 
(or  an  optional  else-part)  is  selected  per  execution  of  a CASE,  a 
list  of  constants  is  explicitly  given  for  each  alternative,  and 
selection  is  based  upon  the  equality  of  the  expression  value  with 
one  of  the  constants.  A case  construct  with  n alternatives  is 
logically  and  semantically  equivalent  to  a certain  pattern  of  n 
nested  if-then-else  constructs. 
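
The equivalence between a case construct and nested if-then-else constructs can be illustrated with a small Python model. This is a sketch under stated assumptions: SIMPL-T itself is not shown, the function names and the representation of alternatives as (constants, result) pairs are hypothetical, and the constant lists are assumed not to overlap.

```python
def case_select(value, alternatives, else_part=None):
    """Model of a CASE: pick the alternative whose constant list contains value."""
    for constants, result in alternatives:
        if value in constants:
            return result
    return else_part  # optional else-part when no constant matches


def nested_if_select(value, alternatives, else_part=None):
    """The same selection written as n nested if-then-else constructs."""
    if not alternatives:
        return else_part
    constants, result = alternatives[0]
    if any(value == constant for constant in constants):
        return result
    else:
        # The remaining alternatives are nested inside this else-part.
        return nested_if_select(value, alternatives[1:], else_part)


# Three alternatives, each with an explicit list of constants (made-up values).
alternatives = [((1, 2), "low"), ((3,), "mid"), ((4, 5, 6), "high")]
```

For every input value, both functions select the same alternative, which is the sense in which a case with n alternatives corresponds to n nested if-then-else tests on equality.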

(20) The WHILE statement is the only iteration or looping
construct provided by the implementation language SIMPL-T. It
allows the statements in the loop body to be executed repeatedly
(zero or more times) depending upon a Boolean expression which is
reevaluated at every iteration; the loop may also be terminated
via an EXIT statement. Each WHILE statement may be optionally
labeled with a designator (referenced by EXIT statements) which
uniquely distinguishes it from other nested WHILE statements.

(21)  The  EXIT  statement  allows  the  abnormal  termination  of
iteration  loops  by  unconditional  transfer  of  control  to  the
statement  immediately  following  the  WHILE  statement.  Thus  it  is  a 
very  restricted  form  of  GOTO.  This  exiting  may  take  place  from 
any  depth  of  nested  loops,  since  the  EXIT  statement  may  optionally 
name  a designator  which  identifies  the  loop  to  be  exited;  without 
such  a designator  only  the  immediately  enclosing  loop  is  exited. 

(22)  Since  there  are  two  types  of  segments  in  the
implementation  language  SIMPL-T,  there  are  two  types  of  "calls"  or
segment  invocations.  Procedures  are  invoked  via  the  CALL 
statement,  and  functions  are  invoked  via  reference  in  an 
expression.  The  counts  for  these  separate  constructs  are  reported 
separately  as  the  (PROC)CALL  and  FUNCTION  CALL  aspects,  and 
jointly  as  the  INVOCATIONS  aspect. 

(23) Intrinsic means provided and defined by the
implementation language; nonintrinsic means provided and defined
by the programmer. These terms are used to distinguish built-in
procedures or functions (which are supported by the compiler and
utilized as primitives) from segments (which are written by the
programmer himself). Nearly all of the intrinsic procedures
provided by the implementation language SIMPL-T perform
input-output operations and external data file manipulations. All
of the intrinsic functions provided by SIMPL-T perform data type
conversions and character string manipulations.

(24) The RETURN statement allows the abnormal termination of
the current segment by unconditional resumption of the previously
executing segment. Thus it is another very restricted form of
GOTO. Within a function, a RETURN statement must specify an
expression, the value of which becomes the value returned for the
function invocation. Within a procedure, a RETURN statement must
not specify such an expression. Additionally, a simple RETURN
statement is optional at the textual end of procedures; it will be
implicitly assumed if not explicitly coded. In this study, the
total number of explicitly coded and implicitly assumed RETURN
statements, both from functions and procedures combined, is
counted.

(25)  The  AVERAGE  STATEMENTS  PER  SEGMENT  aspect  provides  a way
of  normalizing  the  number  of  statements  relative  to  their  natural 
enclosure  in  a program,  the  segment.  The  measure  also  represents 
the  length,  in  executable  statements,  of  a typical  segment  of  the 
program. 

(26) In the implementation language SIMPL-T, both simple
(e.g., assignment) and compound (e.g., if-then-else) statements
may be nested inside other compound statements. A particular
nesting level is associated with each statement, starting at 1 for
a statement at the outermost level of each segment and increasing
by 1 for successively nested statements. Nesting level can be
displayed visually via proper and consistent indentation of the
source code listing.
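
The assignment of nesting levels can be sketched as a short recursive walk over a statement tree. This is a minimal illustration, assuming a hypothetical representation in which each statement is a (kind, body) pair and simple statements have an empty body; it is not the measurement program used in the study.

```python
def nesting_levels(statements, level=1):
    """Return (kind, nesting level) for every statement, in textual order.

    `statements` is a list of (kind, body) pairs; `body` is the list of
    statements nested inside a compound statement (empty for simple ones).
    """
    levels = []
    for kind, body in statements:
        levels.append((kind, level))            # outermost statements are level 1
        levels.extend(nesting_levels(body, level + 1))  # nested ones are one deeper
    return levels


# A made-up segment body: an assignment followed by a WHILE loop that
# contains an IF (with one nested assignment) and a CALL.
segment_body = [
    (":=", []),
    ("WHILE", [
        ("IF", [(":=", [])]),
        ("CALL", []),
    ]),
]
levels = nesting_levels(segment_body)
```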

(27) The number of DECISIONS is simply the sum of the numbers
of IF, CASE, and WHILE statements within the complete source code.
Each of these statements represents a unique (possibly repeated)
run-time decision coded by the programmer. This count is closely
associated with a recently proposed complexity metric [McCabe 76]
which essentially reflects the number of binary-branching
decisions represented in the source code.
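
As a small illustration of note 27, the DECISIONS count can be derived from a table of statement type counts. The sketch below assumes a hypothetical dictionary keyed by the statement keyword or symbol; the example values are made up.

```python
# Statement types that represent a run-time decision, per note 27.
DECISION_TYPES = ("IF", "CASE", "WHILE")


def count_decisions(statement_type_counts):
    """Sum the IF, CASE, and WHILE counts; missing types count as zero."""
    return sum(statement_type_counts.get(kind, 0) for kind in DECISION_TYPES)


# Made-up statement type counts for one program, not data from the study.
statement_type_counts = {
    ":=": 40, "IF": 10, "CASE": 2, "WHILE": 5,
    "EXIT": 1, "CALL": 12, "RETURN": 6,
}
decisions = count_decisions(statement_type_counts)
```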

(28) Tokens are the basic syntactic entities -- such as
keywords, operators, parentheses, identifiers, etc. -- that occur
in a program statement. The average number of tokens per
statement may be viewed as an indication of how much "information"
a typical statement contains, how "powerful" a typical statement
is, or how concisely the statements in general are coded.

(29) An invocation is simply the syntactic occurrence of a
construct by which either a programmer-defined segment or a
built-in routine is invoked from within another segment; both
procedure calls and function references are counted as
INVOCATIONS. They are (sub)classified by the type (i.e., function
or procedure, nonintrinsic or intrinsic) of segment or routine
being invoked.

(30) The group of aspects named AVG INVOCATIONS PER (CALLING)
SEGMENT, etc., represents one way to normalize the absolute number
of invocations. These aspects reflect the number of calls to
programmer-defined segments and built-in routines from a typical
programmer-defined segment. They are (sub)classified by the type
of segment or routine being invoked. The measures for this group
of aspects are computed by simply dividing each of the
corresponding measures in the INVOCATIONS aspect group by the
number of SEGMENTS.

(31) The group of aspects named AVG INVOCATIONS PER (CALLED)
SEGMENT, etc., represents another way to normalize the absolute
number of invocations. These aspects reflect the number of calls
to a typical programmer-defined segment from other segments. They
are (sub)classified by the type (i.e., function or procedure) of
segment being invoked.
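
The two normalizations of notes 30 and 31 can be sketched as follows. This is an illustrative reading, not the study's own procedure: the per-calling averages divide every invocation count by the number of SEGMENTS, as note 30 states, while the per-called averages for each segment type are assumed here to divide the nonintrinsic calls of that type by the number of programmer-defined segments of that type. The function name and the keying of counts by (segment type, origin) are hypothetical.

```python
def _ratio(numerator, denominator):
    """Divide, returning 0.0 when there is nothing to average over."""
    return numerator / denominator if denominator else 0.0


def average_invocations(invocations, n_functions, n_procedures):
    """Return (per_calling, per_called) averages from raw invocation counts.

    `invocations` maps (segment type, origin) to a count, for example
    ("FUNCTION", "NONINTRINSIC") -> 12.
    """
    n_segments = n_functions + n_procedures
    # Note 30: every INVOCATIONS measure divided by the number of SEGMENTS.
    per_calling = {key: _ratio(count, n_segments)
                   for key, count in invocations.items()}
    function_calls = invocations.get(("FUNCTION", "NONINTRINSIC"), 0)
    procedure_calls = invocations.get(("PROCEDURE", "NONINTRINSIC"), 0)
    # Note 31 (assumed reading): calls received by a typical called segment.
    per_called = {
        "ALL": _ratio(function_calls + procedure_calls, n_segments),
        "FUNCTION": _ratio(function_calls, n_functions),
        "PROCEDURE": _ratio(procedure_calls, n_procedures),
    }
    return per_calling, per_called


# Made-up counts for a program with 4 functions and 4 procedures.
invocations = {
    ("FUNCTION", "NONINTRINSIC"): 12, ("FUNCTION", "INTRINSIC"): 4,
    ("PROCEDURE", "NONINTRINSIC"): 20, ("PROCEDURE", "INTRINSIC"): 8,
}
per_calling, per_called = average_invocations(invocations, 4, 4)
```

Under this reading, the overall per-called average equals the sum of the nonintrinsic per-calling averages, since only programmer-defined segments can be called segments.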

(32) A data variable is an individually named scalar or array
of scalars. In the implementation language SIMPL-T, (a) there are
three data types for scalars -- integer, character, and (varying
length) string --, (b) there is one kind of data structure (besides
scalar) -- single dimensional array, with zero-origin subscript
range --, and (c) there are several levels of scope (as explained
in note 33 below) for data variables. In addition, all data
variables in a SIMPL-T program must be explicitly declared, with
attributes fully specified. The number of DATA VARIABLES is
computed by counting each of the data variables declared in the
final software product once, regardless of type, structure, or
scope. Note that each array is counted as a single data variable.

(33) In the implementation language SIMPL-T, data variables
can have any one of essentially four levels of scope -- entry
global, nonentry global, parameter, and local -- depending on where
and how they are declared in the program. Note that the notion of
scope deals only with static accessibility by name; the effective
accessibility of any variable can always be extended by passing it
as a parameter between segments. The scope levels are explained
here (and presented in the aspect (sub)classifications) via a
hierarchy of distinctions.

The primary distinction is between global and nonglobal.
Global variables are accessible by name to each of the segments in
the module in which they are declared. Nonglobal variables are
accessible by name only to the single segment in which they are
declared.

Global variables are secondarily distinguished into entry and
nonentry. Entry globals are actually accessible by name to each
of the segments in several (two or more) modules: the module which
declared the variable ENTRY, plus each of the modules which
declared it EXTernal (as explained in note 34 below). Nonentry
globals are accessible by name only within the module in which
they are declared.

Nonglobal variables are secondarily distinguished into formal
parameter and local. Formal parameters are accessible by name
only within the enclosing (called) segment, but their values are
not completely unrelated to the calling segment (as explained in
note 36 below). Locals are accessible by name only within the
enclosing segment, and their values are completely isolated from
any other segment.


(34) Entry means that the data variable [or segment] is
declared to be accessible from within other separately compiled
modules (in which it must be explicitly declared as EXTernal).
Nonentry means that the data variable [or segment] is accessible
only within the module in which it is declared [or defined]. In
this study these terms are used pertaining only to global
variables. "Entry global" actually constitutes an extra level of
scope beyond "nonentry global". [Although the implementation
language SIMPL-T does allow the EXTernal attribute to be declared
for local variables -- just the enclosing segment has access to a
global declared in a different module --, it is an extremely
obscure and rarely used feature; it never occurred in any of the
final software products examined in this study.]

(35) Modified means referred to, at least once in the program
source code, in such a manner that the value of the data variable
would be (re)set when (and if) the appropriate statements were to
be executed. Data variables can be (re)set only by (a) being the
"target" of an assignment statement, (b) being passed by reference
to some programmer-defined segment or built-in routine, or (c)
being named in an "input statement." This third case is really
covered by the second case since all the "input statements" in
SIMPL-T are actually calls to certain intrinsic procedures with
passed-by-reference parameters. Unmodified means referred to
throughout the program source code in such a manner that the
value of the data variable could never be (re)set during
execution. These terms are used pertaining to global data
variables; any global variable is allowed to have an initial value
(constants only) specified in its declaration. Globals which are
initialized but UNMODIFIED are particularly useful in SIMPL-T
programs, serving as "named constants."


(36) The implementation language SIMPL-T allows two types of
parameter passage. Pass-by-value means that the value of the
actual argument is simply copied (upon invocation) into the
corresponding formal parameter (which thereafter behaves like a
local variable for all intents and purposes), with the effect that
the called routine cannot modify the value of the calling
segment's actual argument. Pass-by-reference means that the
address of the actual argument -- which must be a variable rather
than an expression -- is passed (upon invocation) to the called
routine, with the effect that any changes made by the called
routine to the corresponding formal parameter will be reflected in
the value of the calling segment's actual argument (upon return).
In SIMPL-T, formal parameters which are scalars are normally
(default) passed by value, but they may be explicitly declared to
be passed by reference; formal parameters which are arrays are
always passed by reference.
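
The observable difference between the two kinds of parameter passage can be imitated in Python. This is only an emulation: Python has its own object-reference calling convention, so a one-element list is used here as a stand-in for a scalar passed by reference. All names are hypothetical and no SIMPL-T syntax is implied.

```python
def bump_by_value(n):
    """`n` acts like a local copy; rebinding it is invisible to the caller."""
    n = n + 1
    return n


def bump_by_reference(cell):
    """`cell` is a one-element list standing in for the caller's variable."""
    cell[0] = cell[0] + 1  # the change is visible to the caller on return


def clear_array(values):
    """Models an array parameter: element updates reach the caller's array."""
    for i in range(len(values)):
        values[i] = 0


count = 5
bump_by_value(count)            # caller's `count` is still 5

count_cell = [5]
bump_by_reference(count_cell)   # caller's cell now holds 6

table = [1, 2, 3]
clear_array(table)              # caller's list is now all zeros
```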

(37) The group of aspects named DATA VARIABLE SCOPE COUNTS,
etc., gives the absolute number of declared data variables
according to each level of scope. The group of aspects named DATA
VARIABLE SCOPE PERCENTAGES, etc., gives the relative percentage of
variables at each scope level, compared with the total number of
declared variables. The second group of aspects is computed from
the first by simply dividing by the number of DATA VARIABLES, as a
way of normalizing the data variable scope counts.

(38) Since data variable declarations in the implementation
language SIMPL-T may only appear in certain contexts within the
program -- globals in the context of a module and nonglobals in
the context of a segment --, this provides a natural way to
normalize (or average) the raw counts of data variables. The
group of aspects named AVERAGE GLOBAL VARIABLES PER MODULE, etc.,
represent the number of globals declared for a typical module.
They are computed by simply dividing each of the corresponding raw
counts of global data variables by the number of MODULES. The
group of aspects named AVERAGE NONGLOBAL VARIABLES PER SEGMENT,
etc., represent the number of nonglobals declared for a typical
segment. They are computed by simply dividing each of the
corresponding raw counts of nonglobal data variables by the number
of SEGMENTS.

(39) Since there are two types of parameter passing
mechanisms in the implementation language SIMPL-T (as explained in
note 36 above), it is desirable to normalize their raw frequencies
into relative percentages, indicating the programmer's degree of
"preference" for one type or the other. The group of aspects
named PARAMETER PASSAGE TYPE PERCENTAGES, etc., gives the
percentages of each type of parameter relative to the total number
of parameters declared in the program. They are computed in the
obvious way.

(40) A segment-global usage pair (p,r) is simply an instance
of a global variable r being used by a segment p (i.e., the global
is either modified (set) or accessed (fetched) at least once
within the statements of the segment). Each usage pair represents
a unique "use connection" between a global and a segment. Usage
pairs are (sub)classified by the type (i.e., entry or nonentry,
modified or unmodified) of global data variable involved.

In this study, segment-global usage pairs were counted in
three different ways. First, the (SEG,GLOBAL) ACTUAL USAGE PAIR
counts are the absolute numbers of true usage pairs (p,r): the
global variable r is actually used by segment p. They represent
the true frequencies of use connections within the program.

Second, the (SEG,GLOBAL) POSSIBLE USAGE PAIR counts are the
absolute numbers of potential usage pairs (p,r), given the
program's global variables and their declared scope: the scope of
global variable r simply contains segment p, so that segment p
could potentially modify or access r. These counts of possible
usage pairs are computed as the sum of the number of segments in
each global's scope. They represent a sort of "worst case"
frequencies of use connections.

Third, the (SEG,GLOBAL) USAGE RELATIVE PERCENTAGE counts are a
way of normalizing the number of usage pairs since these measures
are simply the ratios (expressed as percentages) of actual usage
pairs to possible usage pairs. They represent the frequencies of
true use connections relative to potential use connections. These
usage pair relative percentage metrics are empirical estimates of
the likelihood that an arbitrary segment uses (i.e., sets or
fetches the value of) an arbitrary global variable.
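
The three usage pair counts of note 40 can be sketched as follows. This is a minimal illustration under assumed data structures: segments are grouped by module, each global lists the modules whose segments can name it, and each actual use is recorded as a (segment, global) pair. None of these names come from the study.

```python
def usage_pair_metrics(segments_in_module, global_scope, uses):
    """Return (actual, possible, relative percentage) of usage pairs.

    segments_in_module: module name -> list of segment names
    global_scope:       global name -> set of modules in the global's scope
    uses:               set of (segment, global) pairs that actually occur
    """
    actual = len(uses)
    # Possible pairs: sum of the number of segments in each global's scope.
    possible = sum(
        len(segments_in_module[module])
        for modules in global_scope.values()
        for module in modules
    )
    relative = 100.0 * actual / possible if possible else 0.0
    return actual, possible, relative


# Made-up program: module A has two segments, module B has one.
segments_in_module = {"A": ["p", "q"], "B": ["s"]}
# g1 is a nonentry global of A; g2 is an entry global visible in A and B.
global_scope = {"g1": {"A"}, "g2": {"A", "B"}}
uses = {("p", "g1"), ("q", "g2"), ("s", "g2")}

actual, possible, relative = usage_pair_metrics(
    segments_in_module, global_scope, uses
)
```

In this example the possible count is 2 (for g1) plus 3 (for g2), so three actual pairs out of five possible give a relative percentage of 60.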

(41) A segment-global-segment data binding (p,r,q) is an
occurrence of the following arrangement in a program [Stevens,
Myers, and Constantine 74]: a segment p modifies (sets) a global
variable r which is also accessed (fetched) by a segment q, with
segment p different from segment q. The existence of a data
binding (p,r,q) indicates that the behavior of segment q is
probably dependent on the performance of segment p because of the
data variable r, whose value is set by p and used by q. The
binding (p,r,q) is different from the binding (q,r,p) which may
also exist; occurrences such as (p,r,p) are not counted as data
bindings. Thus each (SEG,GLOBAL,SEG) DATA BINDING represents a
unique communication path between a pair of segments via a global
variable. The total number of (SEG,GLOBAL,SEG) DATA BINDINGS
reflects the degree of a certain kind of "connectivity" (i.e.,
between segment pairs via globals) within a complete program.

(42) In this study, segment-global-segment data bindings were
counted in three different ways. First, the ACTUAL count is the
absolute number of true data bindings (p,r,q): the global variable
r is actually modified by segment p and actually accessed by
segment q. It represents the degree of true connectivity in the
program. Second, the POSSIBLE count is the absolute number of
potential data bindings (p,r,q), given the program's global
variables and their declared scope: the scope of global variable r
simply contains both segment p and segment q, so that segment p
could potentially modify r and segment q could potentially access
r. This count of POSSIBLE data bindings is computed as the sum of
terms s*(s-1) for each global, where s is the number of segments
in that global's scope; thus, it is fairly sensitive (numerically
speaking) to the total number of SEGMENTS in a program. It
represents a sort of "worst case" degree of potential
connectivity. Third, the RELATIVE PERCENTAGE is a way of
normalizing the number of data bindings since it is simply the
quotient (expressed as a percentage) of the actual data bindings
divided by the possible data bindings. It represents the degree
of true connectivity relative to potential connectivity.
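
The three data binding counts of note 42 can be sketched in Python as follows. This is an illustration under assumed inputs: the modify and access relations are given as sets of (segment, global) pairs, and the scope of each global is summarized by the number of segments it contains. The names are hypothetical.

```python
def data_binding_metrics(scope_sizes, modifies, accesses):
    """Return (actual, possible, relative percentage) of data bindings.

    scope_sizes: global name -> number of segments s in that global's scope
    modifies:    set of (segment, global) pairs where the segment sets the global
    accesses:    set of (segment, global) pairs where the segment fetches the global
    """
    # Actual bindings (p, r, q): p modifies r, q accesses r, and p differs from q.
    actual = {
        (p, r, q)
        for (p, r) in modifies
        for (q, r2) in accesses
        if r == r2 and p != q
    }
    # Possible bindings: sum of s*(s-1) over all globals.
    possible = sum(s * (s - 1) for s in scope_sizes.values())
    relative = 100.0 * len(actual) / possible if possible else 0.0
    return len(actual), possible, relative


# Made-up program: g1 is visible to three segments, g2 to two.
scope_sizes = {"g1": 3, "g2": 2}
modifies = {("p", "g1")}
accesses = {("p", "g1"), ("q", "g1"), ("s", "g1")}

n_actual, n_possible, pct = data_binding_metrics(scope_sizes, modifies, accesses)
```

Here the actual bindings are (p,g1,q) and (p,g1,s); the occurrence (p,g1,p) is excluded, and the possible count is 3*2 + 2*1 = 8.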

(43) Actual data bindings are (sub)classified as
"subfunctional" or "independent" depending on the invocation
relationship between the two segments. A data binding (p,r,q) is
subfunctional if either of the two segments p or q can invoke the
other, whether directly or indirectly (via a chain of intermediate
invocations involving other segments). In this situation, the
function of the one segment may be viewed as contributing to the
overall function of the other segment. A data binding (p,r,q) is
independent if neither of the two segments p or q can invoke the
other, whether directly or indirectly. The transitive closure of
the call graph among the segments of a program is employed to make
this distinction between subfunctional and independent.
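
The use of the call graph's transitive closure can be sketched as a reachability test. This is a minimal sketch, assuming the call graph is given as a hypothetical dictionary from each segment to the segments it invokes directly; a depth-first search stands in for an explicit closure computation.

```python
def reachable(call_graph, start):
    """Return every segment that `start` can invoke, directly or indirectly."""
    seen = set()
    stack = list(call_graph.get(start, ()))
    while stack:
        segment = stack.pop()
        if segment not in seen:
            seen.add(segment)
            stack.extend(call_graph.get(segment, ()))
    return seen


def classify_binding(call_graph, p, q):
    """Classify a data binding (p, r, q) by the invocation relation of p and q."""
    if q in reachable(call_graph, p) or p in reachable(call_graph, q):
        return "subfunctional"
    return "independent"


# Made-up call graph: main calls a and b; a calls c.
call_graph = {"main": ["a", "b"], "a": ["c"], "b": [], "c": []}
```

With this graph, a binding between main and c is subfunctional (main reaches c through a), while a binding between b and c is independent because neither can invoke the other.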

(44) There exist several instances of duplicate programming
aspects in the Table 1 listing. That is, certain logically unique
aspects appear a second time with another name, in order to
provide alternative views of the same metric and to achieve a
certain degree of completeness within a set of related aspects.
For example, the FUNCTION CALLS aspect and the STATEMENT TYPE
COUNTS\(PROC)CALL aspect are listed (and categorized appropriately)
from the viewpoint of the various types of constructs which
comprise the implementation language. But the very same
metrics can be considered from the unifying viewpoint of the
various subtype frequencies for segment invocations, and thus it
is desirable to include the duplicate aspects
INVOCATIONS\FUNCTION and INVOCATIONS\PROCEDURE as part of the
natural categorization of INVOCATIONS. Listed below are the pairs
of duplicate programming aspects that were considered in this
study:

1. FUNCTION CALLS
   <=> INVOCATIONS\FUNCTION

2. FUNCTION CALLS\NONINTRINSIC
   <=> INVOCATIONS\FUNCTION\NONINTRINSIC

3. FUNCTION CALLS\INTRINSIC
   <=> INVOCATIONS\FUNCTION\INTRINSIC

4. STATEMENT TYPE COUNTS\(PROC)CALL
   <=> INVOCATIONS\PROCEDURE

5. STATEMENT TYPE COUNTS\(PROC)CALL\NONINTRINSIC
   <=> INVOCATIONS\PROCEDURE\NONINTRINSIC

6. STATEMENT TYPE COUNTS\(PROC)CALL\INTRINSIC
   <=> INVOCATIONS\PROCEDURE\INTRINSIC

7. AVG INVOCATIONS PER (CALLING) SEGMENT\NONINTRINSIC
   <=> AVG INVOCATIONS PER (CALLED) SEGMENT

By definition, the data scores obtained for any pair of duplicate
aspects will be identical, and thus the same statistical
conclusions will be reached for both aspects.

Appendix 2. English Statements for the Non-Null Conclusions


The following numbered sentences simply provide English
translations for the non-null location comparisons presented in
symbolic equation form in Table 2.1. They may be skimmed by the
reader since they do not add to the information appearing in the
table.

(1) According to the SEGMENTS aspect, the individuals (AI)
organized their software into noticeably fewer routines
(i.e., functions or procedures) than either the ad hoc teams
(AT) or the disciplined teams (DT).

(2) Both the ad hoc teams (AT) and the disciplined teams (DT)
declared a noticeably larger number of data variables (i.e.,
scalars or arrays of scalars) than the individuals (AI),
according to the DATA VARIABLES aspect.

(3) In particular, a definite trend toward this same difference
was apparent in the number of global variables, the number of
global variables whose values could be modified during
execution, and the number of formal parameter variables,
according to the DATA VARIABLE SCOPE COUNTS\GLOBAL, DATA
VARIABLE SCOPE COUNTS\GLOBAL\MODIFIED, and DATA VARIABLE
SCOPE COUNTS\NONGLOBAL\PARAMETER aspects, respectively.

(4) A trend existed for the individuals (AI) to have a smaller
percentage of formal parameters compared to the total number
of declared data variables than either the ad hoc teams (AT)
or the disciplined teams (DT), according to the DATA VARIABLE
SCOPE PERCENTAGES\NONGLOBAL\PARAMETER aspect.

(5) According to the AVERAGE NONGLOBAL VARIABLES PER SEGMENT\
PARAMETER aspect, there was a trend for the individuals (AI)
to have fewer formal parameters per routine than did either
the ad hoc teams (AT) or the disciplined teams (DT).

(6) A definite trend existed for the individuals (AI) to have
fewer possible segment-global usage pairs (i.e., potential
access of a global variable by a routine) than either the ad
hoc teams (AT) or the disciplined teams (DT), according to
the (SEG,GLOBAL) POSSIBLE USAGE PAIRS aspect.

(7)  According  to  the  AVERAGE  STATEMENTS  PER  SEGMENT  aspect,  the 
individuals  (AI)  displayed  a trend  toward  having  a greater 
number  of  statements  per  routine  than  did  either  the  ad  hoc 
teams  (AT)  or  the  disciplined  teams  (DT).

(8) There existed slight trends toward more calls to
programmer-defined routines per calling routine and per
called routine for the individuals (AI) than for either the
ad hoc teams (AT) or the disciplined teams (DT), according to
the AVG INVOCATIONS PER (CALLING) SEGMENT\NONINTRINSIC and
AVG INVOCATIONS PER (CALLED) SEGMENT aspects.

(9) In addition, a very slight trend existed for the individuals
(AI) to have more calls to programmer-defined functions,
averaged per programmer-defined function, than either the ad
hoc teams (AT) or the disciplined teams (DT), according to
the AVG INVOCATIONS PER (CALLED) SEGMENT\FUNCTION aspect.

(10) According to the DATA VARIABLE SCOPE PERCENTAGES\NONGLOBAL\
LOCAL aspect, the individuals (AI) had a larger percentage of
local variables compared to the total number of declared data
variables than either the ad hoc teams (AT) or the
disciplined teams (DT).

(11) A slight trend existed for both the individuals (AI) and the
disciplined teams (DT) to have a larger relative percentage
of segment-global usage pairs (i.e., the ratio of actual
segment-global usage pairs to possible segment-global usage
pairs) than the ad hoc teams (AT) for nonentry global
variables whose values were not modified during execution
(i.e., the simplest kind of "named constants"), according to
the (SEG,GLOBAL) USAGE RELATIVE PERCENTAGES\NONENTRY\
UNMODIFIED aspect.

(12) According to the STATEMENT TYPE COUNTS\IF and STATEMENT TYPE
PERCENTAGES\IF aspects, both the individuals (AI) and the
disciplined teams (DT) coded noticeably fewer IF statements
than the ad hoc teams (AT), in terms of both total number and
percentage of total statements.

(13) A trend existed, according to the STATEMENT TYPE COUNTS\
(PROC)CALL\INTRINSIC aspect, for the ad hoc teams (AT) to make
a larger number of calls on intrinsic procedures (i.e.,
built-in language-provided routines primarily for
input-output) than either the individuals (AI) or the
disciplined teams (DT).

(14) According to the STATEMENT TYPE COUNTS\RETURN aspect, the ad
hoc teams (AT) had a noticeably larger number of RETURN
statements than either the individuals (AI) or the
disciplined teams (DT).

(15) According to the DECISIONS aspect, both the individuals (AI)
and the disciplined teams (DT) tended to code fewer decisions
(i.e., IF, WHILE, or CASE statements) than the ad hoc teams
(AT).

(16) A trend existed for the ad hoc teams (AT) to have more calls
to intrinsic procedures, with a noticeably larger number of
calls to intrinsic routines (i.e., built-in language-provided
procedures and functions, primarily for input-output and type
conversion), than either the individuals (AI) or the
disciplined teams (DT), according to the INVOCATIONS\
PROCEDURE\INTRINSIC and INVOCATIONS\INTRINSIC aspects,
respectively.

(17) According to the (SEG,GLOBAL,SEG) DATA BINDINGS\POSSIBLE
aspect, there was a slight trend for both the individuals
(AI) and the disciplined teams (DT) to have fewer possible
data bindings [Stevens, Myers, and Constantine 74] (i.e.,
occurrences of the situation where a global variable r is
both potentially modified by a segment p and potentially
accessed by a segment q, with p different from q) than the ad
hoc teams (AT).

(18) According to the COMPUTER JOB STEPS aspect, the disciplined
teams (DT) required very noticeably fewer computer job steps
(i.e., module compilations, program executions, or
miscellaneous job steps) than both the individuals (AI) and
the ad hoc teams (AT).

(19) This same difference was definitely apparent in the total
number of module compilations, the number of unique (i.e.,
not an identical recompilation of a previously compiled
module) module compilations, the number of program
executions, and the number of essential job steps (i.e.,
unique module compilations plus program executions),
according to the COMPUTER JOB STEPS\MODULE COMPILATIONS,
COMPUTER JOB STEPS\MODULE COMPILATIONS\UNIQUE, COMPUTER JOB
STEPS\PROGRAM EXECUTIONS, and ESSENTIAL JOB STEPS aspects,
respectively.

(20) A trend existed for both the individuals (AI) and the ad hoc
teams (AT) to have required more miscellaneous job steps
(i.e., auxiliary compilations or executions of something
other than the final software product) than the disciplined
teams (DT), according to the COMPUTER JOB STEPS\MISCELLANEOUS
aspect.

(21)  According  to  the  AVERAGE  UNIQUE  COMPILATIONS  PER  MODULE  and 
MAX  UNIQUE  COMPILATIONS  F.A.O.  MODULE  aspects,  respectively, 
the  disciplined  teams  (DT)  required  fewer  unique  compilations 
per  module  on  the  average,  with  a definite  trend  toward  fewer 
unique  compilations  for  any  one  module  in  the  worst  case, 
than  either  the  individuals  (AI)  or  the  ad  hoc  teams  (AT). 

(22) According to the LINES aspect, there was a definite trend for
the individuals (AI) to have produced fewer total symbolic
lines (includes comments, compiler directives, statements,
declarations, etc.) than the disciplined teams (DT), who
produced fewer than the ad hoc teams (AT).

(23) A definite trend existed for the individuals (AI) to have a
larger relative percentage of segment-global usage pairs for
entry globals and for entry globals which could be modified
during execution than the disciplined teams (DT), who in turn
had a larger percentage than the ad hoc teams (AT), according
to the (SEG,GLOBAL) USAGE RELATIVE PERCENTAGES\ENTRY and
(SEG,GLOBAL) USAGE RELATIVE PERCENTAGES\ENTRY\MODIFIED
aspects, respectively.

(24) According to the PROGRAM CHANGES aspect, there existed a
trend for the disciplined teams (DT) to require fewer textual
revisions to build and debug the software than the
individuals (AI), who required fewer revisions than the ad hoc
teams (AT).
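
The job-step aspects described in the numbered sentences above are simple tallies over the record of computer job steps. The following Python sketch shows one way such tallies might be computed. The JobStep record, its field names, the "compile"/"execute"/"misc" labels, and the use of a source hash to detect an identical recompilation are all assumptions made for illustration; they do not come from the study.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class JobStep:
    """One computer job step (hypothetical record layout)."""
    kind: str                          # "compile", "execute", or "misc"
    module: Optional[str] = None       # module name, for compilations
    source_hash: Optional[str] = None  # stands in for the module's source text


def job_step_aspects(steps: Iterable[JobStep]) -> Dict[str, float]:
    """Tally job-step counts in the spirit of the COMPUTER JOB STEPS aspects."""
    steps = list(steps)
    seen = set()                           # (module, source) pairs already compiled
    unique_per_module = defaultdict(int)   # unique compilations per module
    compilations = executions = miscellaneous = 0

    for step in steps:
        if step.kind == "compile":
            compilations += 1
            key = (step.module, step.source_hash)
            if key not in seen:            # not an identical recompilation
                seen.add(key)
                unique_per_module[step.module] += 1
        elif step.kind == "execute":
            executions += 1
        else:
            miscellaneous += 1

    unique = sum(unique_per_module.values())
    modules = len(unique_per_module)
    return {
        "total_job_steps": len(steps),
        "module_compilations": compilations,
        "unique_compilations": unique,
        "program_executions": executions,
        "miscellaneous": miscellaneous,
        # essential job steps = unique compilations plus program executions
        "essential_job_steps": unique + executions,
        "avg_unique_compilations_per_module": unique / modules if modules else 0.0,
        "max_unique_compilations_any_one_module":
            max(unique_per_module.values(), default=0),
    }
```

Under these assumptions, recompiling a module whose source has not changed adds to the total compilation count but not to the unique or essential counts.
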

The following numbered sentences simply provide English
translations for the non-null dispersion conclusions presented in
symbolic equation form in Table 2.2. They may be skimmed by the
reader since they do not add to the information appearing in the
table.

(1) The individuals (AI) displayed noticeably less variation in
the number of formal parameters passed by reference than both
the ad hoc teams (AT) and the disciplined teams (DT), with a
similar trend in the percentage of reference parameters
compared to the total number of declared data variables,
according to the DATA VARIABLE SCOPE COUNTS\NONGLOBAL\
PARAMETER\REFERENCE and DATA VARIABLE SCOPE PERCENTAGES\
NONGLOBAL\PARAMETER\REFERENCE aspects.

(2) According to the PARAMETER PASSAGE TYPE PERCENTAGES\VALUE and
PARAMETER PASSAGE TYPE PERCENTAGES\REFERENCE aspects, both
the ad hoc teams (AT) and the disciplined teams (DT) tended
to have more variation in the percentage of value parameters
and reference parameters compared with the total number of
formal parameters declared than the individuals (AI).

(3) The individuals (AI) had less variation in the number of
possible segment-global usage pairs (i.e., potential access
of a global variable by a routine) involving nonentry globals
than either the ad hoc teams (AT) or the disciplined teams
(DT), according to the (SEG,GLOBAL) POSSIBLE USAGE PAIRS\
NONENTRY aspect.

(4) According to the (SEG,GLOBAL,SEG) DATA BINDINGS\ACTUAL\
INDEPENDENT aspect, there was a very slight trend for the
individuals (AI) to have less variation in the number of
actual data bindings [Stevens, Myers, and Constantine 74]
(i.e., occurrences of the situation where a global variable r
is both actually modified by a segment p and actually
accessed by a segment q, with p different from q) in which
the two routines were "independent" (i.e., neither segment
can invoke the other, directly or indirectly) than both the
ad hoc teams (AT) and the disciplined teams (DT).

(5) The individuals (AI) exhibited noticeably greater variation
than either the ad hoc teams (AT) or the disciplined teams
(DT) in the number of miscellaneous job steps (i.e.,
auxiliary compilations or executions of something other than
the final software product), according to the COMPUTER JOB
STEPS\MISCELLANEOUS aspect.

(6) In the number of calls in general and of calls to
programmer-defined routines in particular, the individuals
(AI) displayed noticeably greater variation than both the ad
hoc teams (AT) and the disciplined teams (DT), according to
the INVOCATIONS and INVOCATIONS\NONINTRINSIC aspects.

(7) According to the STATEMENTS aspect, a very slight trend
existed for the ad hoc teams (AT) to show less variability
than either the disciplined teams (DT) or the individuals
(AI) in the number of executable statements.

(8) A trend existed for both the individuals (AI) and the
disciplined teams (DT) to have greater variability than the
ad hoc teams (AT) in the average (per function) number of
calls to programmer-defined functions, according to the AVG
INVOCATIONS PER (CALLED) SEGMENT\FUNCTION aspect.

(9) According to the (SEG,GLOBAL) ACTUAL USAGE PAIRS\MODIFIED
aspect, a definite trend existed for the ad hoc teams (AT) to
have less variability than either the individuals (AI) or the
disciplined teams (DT) in the number of actual segment-global
usage pairs (i.e., actual access of a global variable by a
routine) involving globals which were modified during
execution.

(10)  According  to  the  AVERAGE  SEGMENTS  PER  MODULE  aspect,  the 
individuals  (AI)  and  the  disciplined  teams  (DT)  both 
exhibited  noticeably  less  variation  in  the  average  number  of 
routines  per  module  than  the  ad  hoc  teams  (AT). 

(11) The ad hoc teams (AT) were noticeably more variable than
either the disciplined teams (DT) or the individuals (AI) in
the percentage of coded RETURN statements compared with the
total number of statements, according to the STATEMENT TYPE
PERCENTAGES\RETURN aspect.

(12) According to the AVERAGE GLOBAL VARIABLE PER MODULE\MODIFIED
aspect, the ad hoc teams (AT) displayed a definite trend
toward greater variability than both the individuals (AI) and
the disciplined teams (DT) in the average number of globals
per module which were modified during execution.

(13) The individuals (AI) and the disciplined teams (DT) were both
noticeably less variable than the ad hoc teams (AT) in the
number of possible segment-global usage pairs where the
global variable was nonentry and modified during execution,
according to the (SEG,GLOBAL) POSSIBLE USAGE PAIRS\NONENTRY\
MODIFIED aspect.

(14) According to the (SEG,GLOBAL,SEG) DATA BINDINGS\POSSIBLE
aspect, the ad hoc teams (AT) tended toward greater
variability than either the individuals (AI) or the
disciplined teams (DT) in the number of possible data
bindings.

(15) According to the STATEMENT TYPE COUNTS\(PROC)CALL, STATEMENT
TYPE COUNTS\(PROC)CALL\NONINTRINSIC, INVOCATIONS\PROCEDURE,
and INVOCATIONS\PROCEDURE\NONINTRINSIC aspects, both the
individuals (AI) and the ad hoc teams (AT) were noticeably
more variable than the disciplined teams (DT) in the number
of calls to intrinsic and nonintrinsic procedures, with a
similar trend for calls to nonintrinsic procedures alone.

(16)  This  same  difference  appeared  in  the  average  number  of 
intrinsic  procedure  calls  per  calling  segment,  according  to 
the  AVG  INVOCATIONS  PER  (CALLING)  SEGMENT\PROCEDURE\INTRINSIC 
aspect.

(17) According to the DATA VARIABLE SCOPE PERCENTAGES\GLOBAL\
NONENTRY\MODIFIED aspect, the disciplined teams (DT) displayed
noticeably smaller variation than either the individuals (AI)
or the ad hoc teams (AT) in the percentage of nonentry global
variables that were modified during execution compared to the
total number of data variables declared.

(18) According to the AVERAGE TOKENS PER STATEMENT aspect, a
definite trend existed for the disciplined teams (DT) to
exhibit greater variability in the average number of tokens
(i.e., basic symbolic units) per statement than both the
individuals (AI) and the ad hoc teams (AT).

(19) The trend toward less variation among both the individuals
(AI) and the ad hoc teams (AT) than among the disciplined
teams (DT) existed in the number of global variables and in
the number of formal parameters, according to the DATA
VARIABLE SCOPE COUNTS\GLOBAL and DATA VARIABLE SCOPE COUNTS\
NONGLOBAL\PARAMETER aspects, respectively.

(20) A similar difference in variability existed noticeably in the
percentages, compared to the total number of declared data
variables, of globals, of nonglobals, of formal parameters,
and of formal parameters passed by value, according to the
DATA VARIABLE SCOPE PERCENTAGES\GLOBAL, DATA VARIABLE SCOPE
PERCENTAGES\NONGLOBAL, DATA VARIABLE SCOPE PERCENTAGES\
NONGLOBAL\PARAMETER, and DATA VARIABLE SCOPE PERCENTAGES\
NONGLOBAL\PARAMETER\VALUE aspects, respectively.

(21) According to the (SEG,GLOBAL) POSSIBLE USAGE PAIRS and
(SEG,GLOBAL) POSSIBLE USAGE PAIRS\NONENTRY\UNMODIFIED
aspects, there was a noticeable difference in variability,
with the individuals (AI) less than the disciplined teams
(DT) less than the ad hoc teams (AT), for the total number of
possible segment-global usage pairs, with a similar trend for
possible usage pairs in which the global variable was
nonentry and not modified during execution.

(22) There was a noticeable difference in variability, with the
disciplined teams (DT) less than the individuals (AI) less
than the ad hoc teams (AT), in the maximum number of unique
compilations for any one module, according to the MAX UNIQUE
COMPILATIONS F.A.O. MODULE aspect.

(23) According to the STATEMENT TYPE COUNTS\RETURN aspect, there
was a difference in variability, with the disciplined teams
(DT) less than the individuals (AI) less than the ad hoc
teams (AT), for the number of RETURN statements coded.
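
The data binding definitions given in the numbered sentences above (a global variable r modified by a segment p and accessed by a different segment q, with "independent" bindings being those in which neither segment can invoke the other, directly or indirectly) can be sketched in Python as follows. The dictionary-based inputs, the function names, and the example segment names are hypothetical and chosen only for illustration; the study's own measurement tooling is not described here. Supplying potential modification and access sets yields possible bindings, while supplying actual sets yields actual bindings.

```python
from itertools import permutations
from typing import Dict, Set, Tuple

Binding = Tuple[str, str, str]  # (modifying segment p, global r, accessing segment q)


def data_bindings(modifies: Dict[str, Set[str]],
                  accesses: Dict[str, Set[str]]) -> Set[Binding]:
    """All (p, r, q) with r modified by p, accessed by q, and p different from q."""
    segments = set(modifies) | set(accesses)
    return {
        (p, r, q)
        for p, q in permutations(sorted(segments), 2)   # ordered pairs, p != q
        for r in modifies.get(p, set()) & accesses.get(q, set())
    }


def reachable(calls: Dict[str, Set[str]], start: str) -> Set[str]:
    """Segments that `start` can invoke, directly or indirectly."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        current = stack.pop()
        for callee in calls.get(current, set()):
            if callee not in seen:
                seen.add(callee)
                stack.append(callee)
    return seen


def independent_bindings(bindings: Set[Binding],
                         calls: Dict[str, Set[str]]) -> Set[Binding]:
    """Keep only bindings whose two segments cannot invoke one another."""
    return {
        (p, r, q)
        for (p, r, q) in bindings
        if q not in reachable(calls, p) and p not in reachable(calls, q)
    }
```

For example, if a segment `update` modifies a global that a segment `report` accesses, and `update` also calls `report`, the binding counts toward the total but not toward the independent subset.
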

The following two paragraphs simply provide an English
paraphrase of the "relaxed differentiation" details presented in
Tables 4.1 and 4.2, respectively.

On location comparisons, four programming aspects yielded
completely differentiated conclusions. They are "relaxed" to
partially differentiated conclusions as follows:

1. From DT < AI < AT on PROGRAM CHANGES, the DT < AI = AT
conclusion overwhelmingly dwarfs the DT = AI < AT conclusion

2. The DT < AT difference is more pronounced than the AI < DT
difference from AI < DT < AT on LINES

3. AT < DT < AI on (SEG,GLOBAL) USAGE RELATIVE PERCENTAGES\ENTRY
is more significantly "relaxed" to AT < DT = AI than to
AT = DT < AI

4. The AT < DT and DT < AI differences from AT < DT < AI on
(SEG,GLOBAL) USAGE RELATIVE PERCENTAGES\ENTRY\MODIFIED are
both exactly equally strong

On dispersion comparisons, four programming aspects yielded
completely differentiated conclusions. They are "relaxed" to
partially differentiated conclusions as follows:

1. The DT < AI difference is much more pronounced than the
AI < AT difference from DT < AI < AT on MAX UNIQUE
COMPILATIONS F.A.O. MODULE

2. From DT < AI < AT on STATEMENT TYPE COUNTS\RETURN, the
DT = AI < AT conclusion overwhelmingly dwarfs the
DT < AI = AT conclusion

3. AI < DT < AT on (SEG,GLOBAL) POSSIBLE USAGE PAIRS is more
significantly "relaxed" to AI < AT = DT than to DT = AI < AT

4. The AI < DT difference is more pronounced than the DT < AT
difference from AI < DT < AT on (SEG,GLOBAL) POSSIBLE USAGE
PAIRS\NONENTRY\UNMODIFIED

The following two paragraphs provide a complete itemization
of directionless distinctions. The information contained in
Tables 2 and 4 has simply been reorganized and presented in
English to support a directionless view of the study's results.

Specifically, for the study's location comparisons:

(1) The distinction
AI (individuals) ≠ AT (ad hoc teams) = DT (disciplined teams)
was observed for none of the process aspects and for several
product aspects, including

- the raw count of programmer-defined segments (i.e.,
  routines),

- the raw count of programmer-defined data variables,

- several raw counts and relative percentages of data
  variables according to their scope (i.e., global,
  parameter, or local),

- the raw count of potential segment-global usage pairs
  (which is strongly dependent on the raw counts of
  segments and globals, both of which are also in this
  category), and

- several "per segment" averages of other raw counts (i.e.,
  formal parameters, executable statements, and
  nonintrinsic calls).

(2) The distinction
AT (ad hoc teams) ≠ DT (disciplined teams) = AI (individuals)
was observed for none of the process aspects and for several
product aspects, including

- the raw count of lines of symbolic source code,

- both the raw count and relative percentage of IF
  statements,

- the raw count of programmed decisions (i.e., total number
  of IF, CASE, and WHILE statements),

- the raw count of RETURN statements,

- the raw counts of calls to intrinsic routines and intrinsic
  procedures,

- one ratio of actual to possible accessibility of globals by
  segments, and

- the raw count of possible communication paths between
  segments via globals.

(3) The distinction
DT (disciplined teams) ≠ AI (individuals) = AT (ad hoc teams)
was observed for nearly all the process aspects, including

- nearly all the raw counts of computer job steps, including
  both the total count and all the subclassification
  counts (i.e., compilations, executions, miscellaneous),
  except for identical compilations,

- both "per module" counts of unique compiles, the average
  and the (worst case) maximum, and

- the amount of revision and change made to the source code
  during development,

but for none of the product aspects.

Specifically, for the study's dispersion comparisons:

(1) The distinction
AI (individuals) ≠ AT (ad hoc teams) = DT (disciplined teams)
was observed for one process aspect, namely

- the raw count of miscellaneous computer job steps (i.e.,
  auxiliary compilations or executions of something other
  than the final product),

and for several product aspects, including

- the raw count and several relative percentages of reference
  parameters,

- a few raw counts of potential segment-global usage pairs,

- the raw count of total invocations and invocations of
  programmer-defined routines, and

- the raw count of actual segment-global-segment data
  bindings in which neither segment could invoke the
  other.

(2) The distinction
AT (ad hoc teams) ≠ DT (disciplined teams) = AI (individuals)
was observed for none of the process aspects and for several
product aspects, including

- two "per module" averages of other raw counts (i.e.,
  segments and global variables which were modified during
  execution),

- the raw count of executable statements,

- both the raw count and relative percentage of RETURN
  statements,

- the average number of calls made to programmer-defined
  segments which were functions rather than procedures,

- the raw count of actual segment-global usage pairs in which
  the global variable is modified during execution,

- the raw count of potential segment-global usage pairs in
  which the global variable is not accessible across
  modules and is modified, and

- the raw count of potential segment-global-segment data
  bindings.

(3) The distinction
DT (disciplined teams) ≠ AI (individuals) = AT (ad hoc teams)
was observed for one process aspect, namely

- the (worst case) maximum count of unique compiles for any
  one module,

and for several product aspects, including

- several raw counts and relative percentages of data
  variables according to their scope (i.e., global,
  parameter, or local),

- the raw counts of calls to procedures and to
  programmer-defined procedures,

- the average number of calls to built-in procedures per
  calling segment, and

- the average number of tokens per statement.

